Foreword


This book is a collection of notes and unpublished results which I have 
accumulated on the subject of classical field theory. In 1996, it occurred to me 
that it would be useful to collect these under a common umbrella of conventions, 
as a reference work for myself and perhaps other researchers and graduate 
students. I realize now that this project can never be finished to my satisfaction: 
the material here only diverges. I prefer to think of this not as a finished book, 
so much as some notes from a personal perspective. 

In writing the book, I have not held history as an authority, nor based the 
approach on any particular authors; rather, I have tried to approach the subject 
rationally and systematically. I aimed for the kind of book which I would have 
appreciated myself as a graduate student: a book of general theory accompanied 
by specific examples, which separates logically independent ideas and uses 
a consistent notation; a book which does not skip details of derivation, and 
which answers practical questions. I like books with an attitude, which have 
a special angle on their material, and so I make no apologies for this book’s 
idiosyncrasies. 

Several physicists have influenced me over the years. I am especially grateful 
to David Toms, my graduate supervisor, for inspiring, impressing, even depressing
but never repressing me, with his unstoppable ‘Nike’ philosophy: (shrug)
‘just do it’. I am indebted to the late Peter Wood for kind encouragement, as a 
student, and for entrusting me with his copy of Schweber’s now ex-masterpiece 
Relativistic Quantum Field Theory, one of my most prized possessions. My 
brief acquaintance with Julian Schwinger encouraged me to pay more attention 
to my instincts and less to conforming (though more to the conformal). I have 
appreciated the friendship of Gabor Kunstatter and Meg Carrington, my frequent 
collaborators, and have welcomed occasional encouraging communications 
from Roman Jackiw, one of the champions of classical and quantum field theory. 
I am, of course, indebted to my friends in Oslo. I blame Alan McLachlan 
for teaching me more than I wanted to know about group congruence classes.


Thanks finally to Tai Phan, of the Space Science Lab at Berkeley for providing 
some sources of information for the gallery data. 

Like all software, this book will contain bugs; it is never really finished and 
trivial, even obvious errors creep in inexplicably. I hope that these do not distract 
from my perspective on one of the most beautiful ideas in modern physics: 
covariant field theory. 

I called the original set of these notes: The $X_\mu$ Files: Covert Field Theory,
as a joke to myself. The world of research has become a merciless battleground 
of competitive self-interest, a noise in which it is all but impossible to be heard. 
Without friendly encouragement, and a pinch of humour, the battle to publish 
would not be worth the effort. 


Mark Burgess 
Oslo University College 


“The Dutch astronomer De Sitter was able to show that 
the velocity of propagation of light cannot depend on 
the velocity of motion of the body emitting the light... 
theoretical investigations of H.A. Lorentz...lead[s] conclusively 
to a theory of electromagnetic phenomena, of which the 
law of the constancy of the velocity of light in vacuo 
is a necessary consequence.” 


— Albert Einstein 


“Energy of a type never before encountered.” 


— Spock, Star Trek: The motion picture. 


Part 1 
Fields


Introduction


In contemporary field theory, the word classical is reserved for an analytical 
framework in which the local equations of motion provide a complete description
of the evolution of the fields. Classical field theory is a differential
expression of change in functions of space and time, which summarizes the 
state of a physical system entirely in terms of smooth fields. The differential 
(holonomic) structure of field theory, derived from the action principle, implies 
that field theories are microscopically reversible by design: differential changes 
experience no significant obstacles in a system and may be trivially undone. 
Yet, when summed macroscopically, in the context of an environment, such 
individually reversible changes lead to the well known irreversible behaviours 
of thermodynamics: the reversal of paths through an environmental landscape 
would require the full history of the route taken. Classical field theory thus 
forms a basis for both the microscopic and the macroscopic. 

When applied to quantum mechanics, the classical framework is sometimes 
called the first quantization. The first quantization may be considered the 
first stage of a more complete theory, which goes on to deal with the issues 
of many-particle symmetries and interacting fields. Quantum mechanics is 
classical field theory with additional assumptions about measurement. The 
term quantum mechanics is used as a name for the specific theory of the 
Schrödinger equation, which one learns about in undergraduate studies, but it is 
also sometimes used for any fundamental description of physics, which employs 
the measurement axioms of Schrödinger quantum mechanics, i.e. where change 
is expressed in terms of fields and groups. In that sense, this book is also about 
quantum mechanics, though it does not consider the problem of measurement, 
and all of its subtlety. 

In the so-called quantum field theory, or second quantization, fields are 
promoted from c-number functions to operators, acting upon an additional 
set of states, called Fock space. Fock space supplants Slater determinant 
combinatorics in the classical theory, and adds a discrete aspect to smooth field
theory. It quantizes the allowed amplitudes of the normal modes of the field
and gives excitations the same denumerable property that ensembles of particles 
have; i.e. it adds quanta to the fields, or indistinguishable, countable excitations, 
with varying numbers. Some authors refer to these quanta simply as ‘particles’; 
however, they are not particles in the classical sense of localizable, pointlike 
objects. Moreover, whereas particles are separate entities, quanta are excitations,
spawned from a single entity: the quantum field. The second-quantized
theory naturally incorporates the concept of a lowest possible energy state 
(the vacuum), which rescues the relativistic theory from negative energies and 
probabilities. Such an assumption must be added by hand in the classical theory. 
When one speaks about quantum field theory, one is therefore referring to this 
‘second quantization’ in which the fields are dynamical operators, spawning
indistinguishable quanta. 

This book is not about quantum field theory, though one might occasionally 
imagine it is. It will mention the quantum theory of fields, only insofar as to hint 
at how it generalizes the classical theory of fields. It discusses statistical aspects 
of the classical field to the extent that classical Boltzmann statistical mechanics 
suffices to describe them, but does not delve into interactions or combinatorics. 
One should not be misled; books on quantum field theory generally begin with 
a dose of classical field theory, and many purely classical ideas have come to be 
confused with second-quantized ones. Only in the final chapter is the second- 
quantized framework outlined for comparison. This book is a summary of the 
core methodology, which underpins covariant field theory at the classical level. 
Rather than being a limitation, this avoidance of quantum field theory allows one 
to place a sharper focus on key issues of symmetry and causality which lie at the 
heart of all subsequent developments, and to dwell on the physical interpretation 
of formalism in a way which other treatments take for granted. 


1.1 Fundamental and effective field theories 


The main pursuit of theoretical physics, since quantum mechanics was first 
envisaged, has been to explore the maxim that the more microscopic a theory 
is, the more fundamental it is. In the 1960s and 1970s it became clear that this 
view was too simplistic. Physics is as much about scale as it is about constituent 
components. What is fundamental at one scale might be irrelevant to physics at 
another scale. For example, quark dynamics is not generally required to describe 
the motion of the planets. All one needs, in fact, is an effective theory of planets 
as point mass objects. Their detailed structure is irrelevant to so many decimal
places that it would be nonsense to attempt to include it in calculations. Planets 
are less elementary than quarks, but they are not less fundamental to the problem 
at hand. 

The quantum theory of fields takes account of dynamical correlations between
the field at different points in space and time. These correlations,
called fluctuations or virtual processes, give rise to quantum corrections to the
equations of motion for the fields. At first order, these can also be included 
in the classical theory. The corrections modify the form of the equations of 
motion and lead to effective field equations for the quantized system. At low 
energies, these look like classical field theories with renormalized coefficients. 
Indeed, this sometimes results in the confusion of statistical mechanics with the 
second quantization. Put another way, at a superficial level all field theories are 
approximately classical field theories, if one starts with the right coefficients. 
The reason for this is that all one needs to describe physical phenomena is a 
blend of two things: symmetry and causal time evolution. What troubles the 
second quantization is demonstrating the consistency of this point of view, given 
sometimes uncertain assumptions about space, time and the nature of fields. 
This point has been made, for instance, by Wilson in the context of the 
renormalization group [139]; it was also made by Schwinger, in the early 1970s, 
who, disillusioned with the direction that field theory was taking, redefined his 
own interpretation of field theory called source theory [119], inspired by ideas 
from Shannon’s mathematical theory of communication [123]. The thrust of 
source theory is the abstraction of irrelevant detail from calculations, and a 
reinforcement of the importance of causality and boundary conditions. 


1.2 The continuum hypothesis 


Even in classical field theory, there is a difference between particle and field 
descriptions of matter. This has nothing a priori to do with wave-particle duality 
in quantum mechanics. Rather, it is to do with scale. 

In classical mechanics, individual pointlike particle trajectories are characterized
in terms of ‘canonical variables’ x(t) and p(t), the position and momentum
at time t. Underpinning this description is the assumption that matter can be
described by particles whose important properties are localized at a special place 
at a special time. It is not even necessarily assumed that matter is made of 
particles, since the particle position might represent the centre of mass of an 
entire planet, for instance. The key point is that, in this case, the centre of mass 
is a localizable quantity, relevant to the dynamics. 

In complex systems composed of many particles, it is impractical to take 
into account the behaviour of every single particle separately. Instead, one 
invokes the continuum hypothesis, which supposes that matter can be treated 
as a continuous substance with bulk properties at large enough scales. A system 
with a practically infinite number of point variables is thus reduced to the study 
of continuous functions or effective fields. Classically, continuum theory is a 
high-level or long-wavelength approximation to the particle theory, which blurs 
out the individual particles. Such a theory is called an effective theory. 
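
As a rough illustration of the continuum hypothesis (a sketch only: the smoothing kernel $W_\ell$ and the averaging length $\ell$ are assumptions introduced here, not notation from the text), the mass density of $N$ point particles can be replaced by a smooth field by averaging over a scale $\ell$ much larger than the particle spacing:

```latex
\[
\rho(\mathbf{x},t) = \sum_{i=1}^{N} m_i\,\delta^{3}\big(\mathbf{x}-\mathbf{x}_i(t)\big)
\quad\longrightarrow\quad
\bar\rho(\mathbf{x},t) = \sum_{i=1}^{N} m_i\,W_\ell\big(\mathbf{x}-\mathbf{x}_i(t)\big),
\qquad
\int d^3x\; W_\ell(\mathbf{x}) = 1 .
\]
```

Both densities integrate to the same total mass; only the structure on scales shorter than $\ell$ is lost, which is the sense in which the field description is effective.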

In quantum mechanics, a continuous wavefunction determines the probability 
of measuring a discrete particle event. However, free elementary quantum
particles cannot be localized to precise trajectories because of the uncertainty
principle. This wavefunction-field is different from the continuum hypothesis 
of classical matter: it is a function which represents the state of the particle’s 
quantum numbers, and the probability of its position. It is not just a smeared 
out approximation to a more detailed theory. The continuous, field nature is 
observed as the interference of matter waves in electron diffraction experiments, 
and single-particle events are measured by detectors. If the wavefunction is 
sharply localized in one place, the probability of measuring an event is very 
large, and one can argue that the particle has been identified as a bump in the 
field. 

To summarize, a sufficient number of localizable particles can be viewed as an 
effective field, and conversely a particle can be viewed as a localized disturbance 
in an elementary field. 

To envisage an elementary field as representing particles (not to be confused 
with quanta), one ends up with a picture of the particles as localized disturbances 
in the field. This picture is only completely tenable in the non-relativistic limit of 
the classical theory, however. At relativistic energies, the existence of particles, 
and their numbers, are fuzzy concepts which need to be given meaning by the 
quantum theory of fields. 


1.3 Forces 


In classical mechanics, forces act on particles to change their momentum. The 
mechanical force is defined by 


$$ \mathbf{F} = \frac{d\mathbf{p}}{dt}, \qquad (1.1) $$


where p is the momentum. In field theory, the notion of a dynamical influence 
is more subtle and has much in common with the interference of waves. The 
idea of a force is of something which acts at a point of contact and creates an 
impulse. This is supplanted by the notion of fields, which act at a distance and 
interfere with one another, and currents, which can modify the field in more 
subtle ways. Effective mechanical force is associated with a quantity called the 
energy-momentum tensor $\theta_{\mu\nu}$ or $T_{\mu\nu}$.


1.4 Structural elements of a dynamical system 


The shift of focus, in modern physics, from particle theories to field theories 
means that many intuitive ideas need to be re-formulated. The aim of this book is 
to give a substantive meaning to the physical attributes of fields, at the classical 
level, so that the fully quantized theory makes physical sense. This requires
examples.

A detailed description of dynamical systems touches on a wide variety of
themes, drawing on ideas from both historical and mathematical sources. The 
simplicity of field theory, as a description of nature, is easily overwhelmed by 
these details. It is thus fitting to introduce the key players, and mention their 
significance, before the clear lines of physics become obscured by the topography
of a mathematical landscape. There are two kinds of dynamical system,
which may be called continuous and discrete, or holonomic and non-holonomic. 
In this book, only systems which are parametrized by continuous, spacetime 
parameters are dealt with. There are three major ingredients required in the 
formulation of such a dynamical system. 


• Assumptions
A model of nature embodies a body of assumptions and approximations.
The assumptions define the ultimate extent to which the theory may be
considered valid. The best that physics can do is to find an idealized
description of isolated phenomena under special conditions. These
conditions need to be borne clearly in mind to prevent the mathematical
machinery from straying from the intended path.


• Dynamical freedom
The capacity for a system to change is expressed by introducing dynamical
variables. In this case, the dynamical variables are normally fields.
The number of ways in which a physical system can change is called its
number of degrees of freedom. Such freedom describes nothing unless
one sculpts out a limited form from the amorphous realm of possibility.
The structure of a dynamical system is a balance between freedom and
constraint.

The variables in a dynamical system are fields, potentials and sources.
There is no substantive distinction between field, potential and source:
these are all simply functions of space and time. However, the words
potential or source are often reserved for functions which are either static
or rigidly defined by boundary conditions, whereas field is reserved for
functions which change dynamically according to an equation of motion.


• Constraints
Constraints are restrictions which determine what makes one system
with n variables different from another system with n variables. The
constraints of a system are both dynamical and kinematical.


    - Equations of motion
    These are usually the most important constraints on a system. They
    tell us that the dynamical variables cannot take arbitrary values; they
    are dynamical constraints which express limitations on the way in
    which dynamical variables can change.


    - Sources: external influences
    Physical models almost always describe systems which are isolated
    from external influences. Outside influences are modelled by introducing
    sources and sinks. These are perturbations to a closed system
    of dynamical variables whose value is specified by some external
    boundary conditions. Sources are sometimes called generalized
    forces. Normally, one assumes that a source is a kind of ‘immovable
    object’ or infinite bath of energy whose value cannot be changed
    by the system under consideration. Sources are used to examine
    what happens under controlled boundary conditions. Once sources
    are introduced, conservation laws may be disturbed, since a source
    effectively opens a system to an external agent.


    - Interactions
    Interactions are couplings which relate changes in one dynamical
    variable to changes in another. This usually occurs through a
    coupling of the equations of motion. Interaction means simply that
    one dynamical variable changes another. Interactions can also be
    thought of as internal sources, internal influences.


    - Symmetries and conservation laws
    If a physical system possesses a symmetry, it indicates that even
    though one might try to affect it in a specific way, nothing significant
    will happen. Symmetries exert passive restrictions on the behaviour
    of a system, i.e. kinematical constraints. The conservation of book-keeping
    parameters, such as energy and momentum, is related to
    symmetries, so geometry and conservation are, at some level, related
    topics.


The Lagrangian of a dynamical theory must contain time derivatives if it is to be 
considered a dynamical theory. Clearly, if the rate of change of the dynamical 
variables with time is zero, nothing ever happens in the system, and the most 
one can do is to discuss steady state properties. 
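
A one-particle example may make this concrete (a sketch; the two Lagrangians below, the coordinate $q$ and the potential $V$ are chosen here for illustration and are not taken from the text). Applying the Euler-Lagrange equation to a Lagrangian with and without a time-derivative term gives:

```latex
\[
\frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0 :
\qquad
L = \tfrac{1}{2} m \dot q^{2} - V(q) \;\Rightarrow\; m\ddot q = -\frac{dV}{dq},
\qquad
L = -V(q) \;\Rightarrow\; \frac{dV}{dq} = 0 .
\]
```

In the second case the equation is purely algebraic: it pins $q$ to a stationary point of $V$ and says nothing about evolution in time.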


2 


The electromagnetic field 


Classical electrodynamics serves both as a point of reference and as the point 
of departure for the development of covariant field theories of matter and 
radiation. It was the observation that Maxwell’s equations predict a universal 
speed of light in vacuo which led to the special theory of relativity, and 
this, in turn, led to the importance of perspective in identifying generally 
applicable physical laws. It was realized that the symmetries of special relativity 
meant that electromagnetism could be reformulated in a compact form, using 
a vector notation for spacetime unified into a single parameter space. The 
story of covariant fields therefore begins with Maxwell’s four equations for the 
electromagnetic field in 3 + 1 dimensions. 


2.1 Maxwell’s equations 


In their familiar form, Maxwell’s equations are written (in SI units) 


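$$ \nabla\cdot\mathbf{E} = \frac{\rho_e}{\epsilon_0} \qquad (2.1a) $$
$$ \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \qquad (2.1b) $$
$$ \nabla\cdot\mathbf{B} = 0 \qquad (2.1c) $$
$$ c^2\,(\nabla\times\mathbf{B}) = \frac{\mathbf{J}}{\epsilon_0} + \frac{\partial\mathbf{E}}{\partial t}. \qquad (2.1d) $$


$\rho_e$ is the charge density, $\mathbf{J}$ is the electric current density and $c^2 = (\epsilon_0\mu_0)^{-1}$ is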
the speed of light in a vacuum squared. These are valid, as they stand, in inertial 
frames in flat (3+1) dimensional spacetimes. The study of covariant field theory 
begins by assuming that these equations are true, in the sense that any physical 
laws are ‘true’ — i.e. that they provide a suitably idealized description of the 
physics of electromagnetism. We shall not attempt to follow the path which
led to their discovery, nor explore their limitations. Rather, we are interested
in summarizing their form and substance, and in identifying symmetries which 
allow them to be expressed in an optimally simple form. In this way, we hope 
to learn something deeper about their meaning, and facilitate their application. 


2.1.1 Potentials 


This chapter may be viewed as a demonstration of how applied covariance leads 
to a maximally simple formulation of Maxwell’s equations. A more complete 
understanding of electromagnetic covariance is only possible after dealing with 
the intricacies of chapter 9, which discusses the symmetry of spacetime. Here, 
the aim is to build an algorithmic understanding, in order to gain a familiarity 
with key concepts for later clarification. 

In texts on electromagnetism, Maxwell’s equations are solved for a number 
of problems by introducing the idea of the vector and scalar potentials. The
potentials play an important role in modern electrodynamics, and are a convenient
starting point for introducing covariance. 

The electromagnetic potentials are introduced by making use of two theorems,
which allow Maxwell’s equations to be re-written in a simplified form. In
a covariant formulation, one starts with these and adds the idea of a unified 
spacetime. Spacetime is the description of space and time which treats the 
apparently different parameters x and t in a symmetrical way. It does not claim 
that they are equivalent, but only that they may be treated together, since both 
describe different aspects of the extent of a system. The procedure allows us to 
discover a simplicity in electromagnetism which is not obvious in eqns. (2.1). 

The first theorem states that the vanishing divergence of a vector implies that 
it may be written as the curl of some other vector quantity A: 


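$$ \nabla\cdot\mathbf{v} = 0 \;\Rightarrow\; \mathbf{v} = \nabla\times\mathbf{A}. \qquad (2.2) $$


The second theorem asserts that the vanishing of the curl of a vector implies that
it may be written as the gradient of some scalar $\phi$:


$$ \nabla\times\mathbf{v} = 0 \;\Rightarrow\; \mathbf{v} = \nabla\phi. \qquad (2.3) $$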


The deeper reason for both these theorems, which will manifest itself later, is 
that the curl has an anti-symmetric property. The theorems, as stated, are true 
in a homogeneous, isotropic, flat space, i.e. in a system which does not have 
irregularities, but they can be generalized to any kind of space. From these, one 
defines two potentials: a vector potential $A_i$ and a scalar $\phi$, which are auxiliary
functions (fields) of space and time. 
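
The role of the anti-symmetry can be seen in one line of index notation (a sketch, assuming Cartesian components, the three-dimensional Levi-Civita symbol $\epsilon_{ijk}$, and fields smooth enough that partial derivatives commute). It shows the converse statements, namely that a curl has no divergence and a gradient has no curl, which is what makes the two parametrizations consistent:

```latex
\[
\nabla\cdot(\nabla\times\mathbf{A}) = \epsilon_{ijk}\,\partial_i\partial_j A_k = 0,
\qquad
\big(\nabla\times\nabla\phi\big)_i = \epsilon_{ijk}\,\partial_j\partial_k\phi = 0 .
\]
```

Each expression contracts the anti-symmetric $\epsilon_{ijk}$ with a pair of derivatives that is symmetric under exchange, so the terms cancel in pairs.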

The physical electromagnetic field is the derivative of the potentials. From 
eqn. (2.1c), one defines 


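$$ \mathbf{B} = \nabla\times\mathbf{A}. \qquad (2.4) $$

With this definition, eqn. (2.1c) is
automatically and completely solved by re-parametrizing the problem in terms
of a new variable. Eqn. (2.1b) tells us now that


$$ \nabla\times\mathbf{E} = -\frac{\partial}{\partial t}(\nabla\times\mathbf{A}) $$

$$ \nabla\times\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = 0. \qquad (2.5) $$

Consequently, according to the second theorem, one can write

$$ \mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} = -\nabla\phi, \qquad (2.6) $$

giving

$$ \mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}. \qquad (2.7) $$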


The minus sign on the right hand side of eqn. (2.6) is the convention which is 
used to make attractive forces positive and repulsive forces negative. 
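
As a consistency check (a short worked verification, assuming only the definitions $\mathbf{B} = \nabla\times\mathbf{A}$ and $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$ and that derivatives commute), the fields built from the potentials satisfy the two source-free Maxwell equations, eqns. (2.1c) and (2.1b), identically:

```latex
\[
\nabla\cdot\mathbf{B} = \nabla\cdot(\nabla\times\mathbf{A}) = 0,
\qquad
\nabla\times\mathbf{E}
 = -\nabla\times\nabla\phi - \frac{\partial}{\partial t}(\nabla\times\mathbf{A})
 = -\frac{\partial\mathbf{B}}{\partial t} .
\]
```

Only the two equations containing sources remain to be solved for the potentials.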

Introducing potentials in this way is not a necessity: many problems in 
electromagnetism can be treated by solving eqns. (2.1) directly, but the
introduction often leads to significant simplifications when it is easier to solve for
the potentials than it is to solve for the fields. 

The potentials themselves are a mixed blessing: on the one hand, the 
re-parametrization leads to a number of helpful insights about Maxwell’s equations.
In particular, it reveals symmetries, such as the gauge symmetry, which
we shall explore in detail later. It also allows us to write the matter-radiation
interaction in a local form which would otherwise be impossible. The price one 
pays for these benefits is the extra conceptual layers associated with the potential 
and its gauge invariance. This confuses several issues and forces us to deal with 
constraints, or conditions, which uniquely define the potentials. 


2.1.2 Gauge invariance 


Gauge invariance is a symmetry which expresses the freedom to re-define the 
potentials arbitrarily without changing their physical significance. In view of 
the theorems above, the fields $\mathbf{E}$ and $\mathbf{B}$ are invariant under the re-definitions


$$ \phi \rightarrow \phi' = \phi - \frac{\partial s}{\partial t} $$
$$ \mathbf{A} \rightarrow \mathbf{A}' = \mathbf{A} + \nabla s. \qquad (2.8) $$


These re-definitions are called gauge transformations, and s(x) is an arbitrary
scalar function. The transformation means that, when the potentials are used
as variables to solve Maxwell’s equations, the parametrization of physics is not
unique. Another way of saying this is that there is a freedom to choose between 
one of many different values of the potentials, each of which leads to the same 
values for the physical fields E and B. One may therefore choose whichever 
potential makes the solution easiest. This is a curious development. Why make a 
definite problem arbitrary? Indeed, this freedom can cause problems if one is not 
cautious. However, the arbitrariness is unavoidable: it is deeply connected with 
the symmetries of spacetime (the Lorentz group). Occasionally gauge invariance 
leads to helpful, if abstract, insights into the structure of the field theory. At other 
times, it is desirable to eliminate the fictitious freedom it confers by introducing 
an auxiliary condition which pins down a single $\phi$, $\mathbf{A}$ pair for each value of
E, B. As long as one uses a potential as a tool to solve Maxwell’s equations, 
it is necessary to deal with gauge invariance and the multiplicity of equivalent 
solutions which it implies. 
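
The invariance can be checked directly (a sketch, assuming the gauge transformation takes the form $\phi \to \phi - \partial s/\partial t$, $\mathbf{A} \to \mathbf{A} + \nabla s$, with signs chosen to match $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$):

```latex
\[
\begin{aligned}
\mathbf{B}' &= \nabla\times(\mathbf{A}+\nabla s) = \nabla\times\mathbf{A} = \mathbf{B}, \\
\mathbf{E}' &= -\nabla\Big(\phi - \frac{\partial s}{\partial t}\Big)
               - \frac{\partial}{\partial t}\big(\mathbf{A}+\nabla s\big)
             = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t} = \mathbf{E} .
\end{aligned}
\]
```

The two terms containing $s$ in $\mathbf{E}'$ cancel because the time derivative and the gradient commute, and $s$ drops out of $\mathbf{B}'$ because the curl of a gradient vanishes.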


2.1.3 4-vectors and (n + 1)-vectors 


Putting the conceptual baggage of gauge invariance aside for a moment, one 
proceeds to make Maxwell’s equations covariant by combining space and time 
in a unified vector formulation. This is easily done by looking at the equations 
of motion for the potentials. The equations of motion for the vector potentials 
are found as follows: first, substituting for the electric field in eqn. (2.1a) using 
eqn. (2.7), one has 


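$$ -\nabla^2\phi - \frac{\partial}{\partial t}(\nabla\cdot\mathbf{A}) = \frac{\rho_e}{\epsilon_0}. \qquad (2.9) $$

Similarly, using eqn. (2.4) in (2.1d), one obtains

$$ c^2\,\nabla\times(\nabla\times\mathbf{A}) = \frac{\mathbf{J}}{\epsilon_0} + \frac{\partial}{\partial t}\left(-\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}\right). \qquad (2.10) $$

Using the vector identity

$$ \nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} \qquad (2.11) $$

to simplify this, one obtains

$$ c^2\left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2\right)\mathbf{A} = \frac{\mathbf{J}}{\epsilon_0} - \nabla\left(\frac{\partial\phi}{\partial t} + c^2(\nabla\cdot\mathbf{A})\right). \qquad (2.12) $$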


It is already apparent from eqns. (2.8) that the potentials $\phi$, $\mathbf{A}$ are not unique.
This fact can now be used to tidy up eqn. (2.12), by making a choice for $\phi$ and
$\mathbf{A}$:


$$ \nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0. \qquad (2.13) $$

The right hand side of eqn. (2.13) is chosen to be zero, but, of course, any
constant would do. Making this arbitrary (but not random) choice, is called 
choosing a gauge. It partially fixes the freedom to choose the scalar field s in 
eqns. (2.8). Specifically, eqn. (2.13) is called the Lorentz gauge. This common 
choice is primarily used to tidy up the equations of motion, but, as noted above, 
at some point one has to make a choice anyway so that a single pair of vector 
potentials (scalar, vector) corresponds to only one pair of physical fields (E, B). 

The freedom to choose the potentials is not entirely fixed by the adoption of 
the Lorentz condition, however, as we may see by substituting eqn. (2.8) into 
eqn. (2.13). Eqn. (2.13) is not completely satisfied; instead, one obtains a new
condition

$$ \left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) s = 0. \qquad (2.14) $$


A second condition is required in general to eliminate all of the freedom in the 
vector potentials. 
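
For example (a sketch, assuming the gauge function enters as $\mathbf{A}\to\mathbf{A}+\nabla s$, $\phi\to\phi-\partial s/\partial t$; the plane-wave form of $s$, with amplitude $s_0$, wavevector $\mathbf{k}$ and frequency $\omega$, is chosen here purely for illustration), any gauge function obeying the wave equation leaves the Lorentz condition intact:

```latex
\[
\begin{aligned}
\nabla\cdot\mathbf{A}' + \frac{1}{c^2}\frac{\partial\phi'}{\partial t}
 &= \Big(\nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t}\Big)
  + \Big(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Big)s , \\
s = s_0\cos(\mathbf{k}\cdot\mathbf{x} - \omega t),\quad \omega = c\,|\mathbf{k}|
 \;&\Rightarrow\;
 \Big(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Big)s
 = \Big(-|\mathbf{k}|^2 + \frac{\omega^2}{c^2}\Big)s = 0 .
\end{aligned}
\]
```

Such wave-like gauge functions are exactly the freedom that the Lorentz condition leaves unfixed.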

General covariance is now within reach. The symmetry with which space and 
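time, and also $\phi$ and $\mathbf{A}$, enter into these equations leads us to define spacetime
vectors and derivatives:


$$ \partial_\mu = \left(\frac{1}{c}\partial_t,\ \nabla\right) \qquad (2.15) $$
$$ x^\mu = \begin{pmatrix} ct \\ \mathbf{x} \end{pmatrix}, \qquad (2.16) $$

with Greek indices $\mu, \nu = 0, \ldots, n$ and $x^0 = ct$. Repeated indices are summed
according to the usual Einstein summation convention, and we define¹


$$ \Box = \partial^\mu\partial_\mu = -\frac{1}{c^2}\partial_t^2 + \nabla^2. \qquad (2.17) $$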


In n space dimensions and one time dimension (n = 3 normally), the (n + 1) 
dimensional vector potential is defined by 


$$ A^\mu = \begin{pmatrix} \phi/c \\ \mathbf{A} \end{pmatrix}. \qquad (2.18) $$


Using these (n + 1) dimensional quantities, it is now possible to re-write 
eqn. (2.12) in an extremely beautiful and fully covariant form. First, one 
re-writes eqn. (2.10) as 

$$ -\Box\,\mathbf{A} = \frac{\mathbf{J}}{c^2\epsilon_0} - \nabla(\partial_\mu A^\mu). \qquad (2.19) $$


¹ In some old texts, authors wrote $\Box^2$ for the same operator, since it is really a four-sided
(four-dimensional) version of $\nabla^2$.

Next, one substitutes the gauge condition eqn. (2.13) into eqn. (2.9), giving


$$ -\Box\,\phi = \frac{\rho_e}{\epsilon_0}. \qquad (2.20) $$


Finally, the (n + 1) dimensional current is defined by 


$$ J^\mu = \begin{pmatrix} c\rho_e \\ \mathbf{J} \end{pmatrix}, \qquad (2.21) $$


and we end up with the (n + 1) dimensional field equation 


$$ -\Box\,A^\mu = \mu_0 J^\mu, \qquad (2.22) $$


where $c^2 = (\mu_0\epsilon_0)^{-1}$ has been used. The potential is still subject to the
constraint in eqn. (2.13), which now appears as


$$ \partial_\mu A^\mu = 0. \qquad (2.23) $$
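
Writing out the components confirms that the single covariant equation reproduces both wave equations (a sketch, assuming the component assignments $A^\mu = (\phi/c, \mathbf{A})$ and $J^\mu = (c\rho_e, \mathbf{J})$, the Lorentz gauge, and $\mu_0 = 1/(c^2\epsilon_0)$):

```latex
\[
\begin{aligned}
\mu = 0:&\quad -\Box\,\frac{\phi}{c} = \mu_0\,c\,\rho_e
  \;\Longleftrightarrow\; -\Box\,\phi = \mu_0 c^2 \rho_e = \frac{\rho_e}{\epsilon_0}, \\
\mu = i:&\quad -\Box\,\mathbf{A} = \mu_0\,\mathbf{J} = \frac{\mathbf{J}}{c^2\epsilon_0} .
\end{aligned}
\]
```

The first line is the scalar wave equation for $\phi$ and the second is the vector wave equation for $\mathbf{A}$ with the gauge term set to zero.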


2.1.4 The field strength 


The new attention given to the potential $A_\mu$ should not distract from the main
aim of electromagnetism: namely to solve Maxwell’s equations for the electric
and magnetic fields. These two physical components also have a covariant
formulation; they are now elegantly unified as the components of a rank 2 tensor
which is denoted $F_{\mu\nu}$ and is defined by


$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu; \qquad (2.24) $$

the tensor is anti-symmetric

$$ F_{\mu\nu} = -F_{\nu\mu}. \qquad (2.25) $$


This anti-symmetry, which was alluded to earlier, is the reason for the gauge 
invariance. The form of eqn. (2.24) is like a (3 + 1) dimensional curl, 
expressed in index notation. The explicit components of this field tensor are
the components of the electric and magnetic fields, in a Cartesian
basis $\mathbf{E} = (E_1, E_2, E_3)$, etc.:


$$ F_{\mu\nu} = \begin{pmatrix} 0 & -E_1/c & -E_2/c & -E_3/c \\ E_1/c & 0 & B_3 & -B_2 \\ E_2/c & -B_3 & 0 & B_1 \\ E_3/c & B_2 & -B_1 & 0 \end{pmatrix}. \qquad (2.26) $$


In chapter 9, it will be possible to provide a complete understanding of how 
the symmetries of spacetime provide an explanation for why the electric and
magnetic components of this field appear to be separate entities, in a fixed
reference frame. 
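
Two representative components can be worked out from the definition $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ (a sketch, assuming the metric signature $(-,+,+,+)$ implied by the definition of $\Box$, so that lowering the index on $A^\mu = (\phi/c, \mathbf{A})$ gives $A_\mu = (-\phi/c, \mathbf{A})$):

```latex
\[
F_{0i} = \partial_0 A_i - \partial_i A_0
       = \frac{1}{c}\Big(\frac{\partial A_i}{\partial t} + \partial_i\phi\Big)
       = -\frac{E_i}{c},
\qquad
F_{12} = \partial_1 A_2 - \partial_2 A_1 = (\nabla\times\mathbf{A})_3 = B_3 .
\]
```

The remaining entries follow in the same way, with the anti-symmetry $F_{\mu\nu} = -F_{\nu\mu}$ fixing the lower triangle.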

With the help of the potentials, three of Maxwell’s equations (eqns. (2.1a,c,d)) 
are now expressed in covariant form. Eqn. (2.1c) is solved implicitly by the 
vector potential. The final equation (and also eqn. (2.1c), had one not used the 
vector potential) is an algebraic identity, called the Jacobi or Bianchi identity. 
Moreover, the fact that it is an identity is only clear when we write the equations 
in covariant form. The final equation can be written 


$$ \epsilon^{\mu\nu\lambda\rho}\,\partial_\nu F_{\lambda\rho} = 0, \qquad (2.27) $$


where $\epsilon^{\mu\nu\lambda\rho}$ is the completely anti-symmetric tensor in four dimensions, defined
by its components in a Cartesian basis:


$$ \epsilon^{\mu\nu\lambda\rho} = \begin{cases} +1 & \mu\nu\lambda\rho = 0123 \text{ and even permutations} \\ -1 & \mu\nu\lambda\rho = 0132 \text{ and other odd permutations} \\ 0 & \text{otherwise.} \end{cases} \qquad (2.28) $$


This equation is not a condition on $F_{\mu\nu}$, in spite of appearances. The
anti-symmetry of both $\epsilon^{\mu\nu\lambda\rho}$ and $F_{\mu\nu}$ implies that the expansion of eqn. (2.27),
in terms of components, includes many terms of the form $(\partial_\nu\partial_\lambda - \partial_\lambda\partial_\nu)A_\rho$,
the sum of which vanishes, provided $A_\rho$ contains no singularities. Since the
vector potential is a continuous function in all physical systems,² the truth of the
identity is not in question here.

The proof that this identity results in the two remaining Maxwell’s equations 
applies only in 3 + 1 dimensions. In other numbers of dimensions the equations 
must be modified. We shall not give it here, since it is easiest to derive using the 
index notation and we shall later re-derive our entire formalism consistently in 
that framework. 


2.1.5 Covariant field equations using $F_{\mu\nu}$


The vector potential has been used thus far, because it was easier to identify the 
structure of the (3 + 1) dimensional vectors than to guess the form of $F^{\mu\nu}$, but
one can now go back and re-express the equations of motion in terms of the
so-called physical fields, or field strength $F_{\mu\nu}$. The arbitrary choice of gauge in
eqn. (2.22) is then eliminated. 

Returning to eqn. (2.9) and adding and subtracting $\partial_0^2\phi$, one obtains


$$ -\Box\,\phi - \partial_t(\partial_\nu A^\nu) = \frac{\rho_e}{\epsilon_0}. \tag{2.29} $$


² The field strength can never change by more than a step function, because of Gauss' law: the
field is caused by charges, and a point charge (delta function) is the most singular charge that
exists physically. This ensures the continuity of $A_\mu$.

Adding this to eqn. (2.19) (without choosing a value for $\partial_\nu A^\nu$), one has


$$ -\Box A^\mu = \frac{J^\mu}{c^2\epsilon_0} - \partial^\mu(\partial_\nu A^\nu). \tag{2.30} $$


Taking the last term on the right hand side over to the left and using eqn. (2.17)
yields


$$ \partial_\nu(\partial^\mu A^\nu - \partial^\nu A^\mu) = \frac{J^\mu}{c^2\epsilon_0}. \tag{2.31} $$


The parenthesis on the left hand side is now readily identified as $F^{\mu\nu}$ and,
since $c^2\epsilon_0 = 1/\mu_0$, one has

$$ \partial_\nu F^{\mu\nu} = \mu_0 J^\mu. \tag{2.32} $$


This is the covariant form of the field equations for the physical fields. It 
incorporates two of the four Maxwell equations as before (eqn. (2.1c) is implicit 
in the structure we have set up). The final eqn. (2.27) is already expressed in 
terms of the physical field strength, so no more attention is required. 


2.1.6 Two invariants 


There are two invariant, scalar quantities (no free indices) which can be written 
down using the physical fields in (3 + 1) dimensions. They are 


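$$ F = F^{\mu\nu}F_{\mu\nu} \tag{2.33} $$
$$ G = \epsilon^{\mu\nu\lambda\rho}F_{\mu\nu}F_{\lambda\rho}. \tag{2.34} $$

The first of these evaluates to

$$ F = 2\left(\mathbf{B}^2 - \frac{\mathbf{E}^2}{c^2}\right). \tag{2.35} $$


In chapter 4 this quantity is used to construct the action of the system: a
generating function for the dynamical behaviour. The latter gives, up to a
constant numerical factor,


$$ G \propto \mathbf{E}\cdot\mathbf{B}. \tag{2.36} $$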


In four dimensions, this last quantity vanishes for a self-consistent field: the 
electric and magnetic components of a field (resulting from the same source) 
are always perpendicular. In other numbers of dimensions the analogue of this 
invariant does not necessarily vanish. 

The quantity F has a special significance. It turns out to be a Lagrangian, 
or generating functional, for the electromagnetic field. It is also related to the 
energy density of the field by a simple transformation. 
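
The two invariants can be checked numerically. The sketch below is illustrative only: it assumes the component layout of eqn. (2.26), a metric of signature (-, +, +, +) for raising indices, and the convention that the anti-symmetric symbol equals +1 for the ordering 0123. The function names and the sample field values are hypothetical. Under these assumptions the second invariant comes out as -8/c times E·B.

```python
import math
from itertools import permutations

C = 299_792_458.0             # speed of light in m/s
ETA = (-1.0, 1.0, 1.0, 1.0)   # diagonal of the metric, signature (-, +, +, +)

def field_tensor(E, B, c=C):
    """F_{mu nu} with lower indices, laid out as in eqn. (2.26)."""
    e1, e2, e3 = (x / c for x in E)
    b1, b2, b3 = B
    return [[0.0, -e1, -e2, -e3],
            [e1, 0.0, b3, -b2],
            [e2, -b3, 0.0, b1],
            [e3, b2, -b1, 0.0]]

def parity(p):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    inversions = sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4))
    return -1 if inversions % 2 else 1

def invariant_F(F):
    """F^{mu nu} F_{mu nu}; each raised index contributes one factor from ETA."""
    return sum(ETA[m] * ETA[n] * F[m][n] ** 2 for m in range(4) for n in range(4))

def invariant_G(F):
    """epsilon^{mu nu lambda rho} F_{mu nu} F_{lambda rho}, with epsilon^{0123} = +1."""
    return sum(parity(p) * F[p[0]][p[1]] * F[p[2]][p[3]]
               for p in permutations(range(4)))

# Hypothetical sample fields (SI units: V/m and T), chosen only for illustration.
E = (1.0, 2.0, 3.0)
B = (4.0e-9, 5.0e-9, 6.0e-9)
F = field_tensor(E, B)

E_sq = sum(x * x for x in E)
B_sq = sum(x * x for x in B)
E_dot_B = sum(x * y for x, y in zip(E, B))

print(math.isclose(invariant_F(F), 2.0 * (B_sq - E_sq / C**2), rel_tol=1e-9))
print(math.isclose(invariant_G(F), -8.0 * E_dot_B / C, rel_tol=1e-9))
```

Both comparisons should print `True`. Choosing E perpendicular to B makes the second invariant vanish, as for a radiation field.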


2.1.7 Gauge invariance and physical momentum


As shown, Maxwell's equations and the physical field $F_{\mu\nu}$ are invariant under
gauge transformations of the form


$$ A_\mu \rightarrow A_\mu + (\partial_\mu s). \tag{2.37} $$


It turns out that, when considering the interaction of the electromagnetic field 
with matter, the dynamical variables for matter have to change under this gauge 
transformation in order to uphold the invariance of the field equations. 
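
The invariance of the field strength under eqn. (2.37) can be verified directly from the definition (2.24). The only assumption is that $s(x)$ is smooth enough for its partial derivatives to commute:

```latex
\begin{aligned}
F_{\mu\nu} \rightarrow F'_{\mu\nu}
  &= \partial_\mu\left(A_\nu + \partial_\nu s\right)
   - \partial_\nu\left(A_\mu + \partial_\mu s\right) \\
  &= F_{\mu\nu} + \left(\partial_\mu\partial_\nu - \partial_\nu\partial_\mu\right)s \\
  &= F_{\mu\nu}.
\end{aligned}
```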

First, consider classical particles interacting with an electromagnetic field. 
The force experienced by classical particles with charge q and velocity v is the 
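Lorentz force


$$ \mathbf{F}_{\rm Lorentz} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}). \tag{2.38} $$


The total force for an electron in an external potential $V$ and an electromagnetic
field is therefore

$$ \frac{dp_i}{dt} = -e(E_i + \epsilon_{ijk}v_j B_k) - \partial_i V. \tag{2.39} $$


Expressing $\mathbf{E}$ and $\mathbf{B}$ in terms of the vector potential, we have


$$ \partial_t(p_i - eA_i) = -eF_{ij}\dot{x}^j - \partial_i(V + eA_t). \tag{2.40} $$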


This indicates that, apart from a gauge-invariant Biot-Savart contribution in the
first term on the right hand side of this equation, the electromagnetic interaction
is achieved by replacing the momentum $p_i$ and the energy $E$ by


$$ p_\mu \rightarrow (p_\mu - eA_\mu). \tag{2.41} $$


The Biot-Savart term can also be accounted for in this way if we go over to a 
relativistic, Lorentz-covariant form of the equations: 


$$ \frac{d}{d\tau}(p_\mu - eA_\mu) + F_{\mu\nu}I^\nu = 0, \tag{2.42} $$


where $I^\mu = -e\,dx^\mu/d\tau \sim I\,dl$ is the current in a length of wire $dx$ (with
dimensions current × length) and $\tau$ is the proper time. In terms of the more
familiar current density, we have


$$ \frac{d}{d\tau}(p_\mu - eA_\mu) + \int d\sigma\, F_{\mu\nu}J^\nu = 0. \tag{2.43} $$


We can now investigate what happens under a gauge transformation. Clearly, 
these equations of motion can only be invariant if $p_\mu$ also transforms so as to
cancel the term, $\partial_\mu s$, in eqn. (2.37). We must have in addition


$$ p_\mu \rightarrow p_\mu + e\,\partial_\mu s. \tag{2.44} $$

Without a deeper appreciation of symmetry, this transformation is hard to
understand. Arising here in a classical context, where symmetry is not emphasized,
it seems unfamiliar. What is remarkable, however, is that the group theoretical
notions of the quantum theory of matter make the transformation very clear. The
reason is that the state of a quantum mechanical system is formulated very
conveniently as a vector in a group theoretical vector space. Classically,
positions and momenta are not given a state-space representation. In the quantum
formulation, gauge invariance is a simple consequence of the invariance of 
the equations of motion under changes of the arbitrary complex phase of the 
quantum state or wavefunction. 

In covariant vector language, the field equations are invariant under a
re-definition of the vector potential by


$$ A_\mu \rightarrow A_\mu + (\partial_\mu s), \tag{2.45} $$


where s(x) is any scalar field. This symmetry is not only a mathematical 
curiosity; it also has a physical significance, which has to do with conservation. 


2.1.8 Wave solutions to Maxwell’s equations 


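The equation for harmonic waves $\psi(x)$, travelling with speed $v$, is given by


$$ \left(\nabla^2 - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\right)\psi(x) = 0. \tag{2.46} $$


If the speed of the waves is $v = c$, this may be written in the compact form


$$ -\Box\,\psi(x) = 0. \tag{2.47} $$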


It should already be apparent from eqn. (2.22) that Maxwell’s equations have 
wavelike solutions which travel at the speed of light. Writing eqn. (2.22) in 
terms of the field strength tensor, we have 


$$ -\Box F_{\mu\nu} = \mu_0(\partial_\mu J_\nu - \partial_\nu J_\mu). \tag{2.48} $$


In the absence of electric charge $J_\mu = 0$, the solutions are free harmonic waves.
When $J_\mu \neq 0$, Maxwell's equations may be thought of as the equations of
forced oscillations, but this does not necessarily imply that all the solutions 
of Maxwell’s equations are wavelike. The Fourier theorem implies that any 
function may be represented by a suitable linear super-position of waves. This is 
understood by noting that the source in eqn. (2.48) is the spacetime ‘curl’ of the 
current, which involves an extra derivative. Eqn. (2.32) is a more useful starting 
point for solving the equations for many sources. The free wave solutions for 
the field are linear combinations of plane waves with constant coefficients: 


$$ A_\mu(k) = C_k\exp(ik_\mu x^\mu). \tag{2.49} $$

By substituting this form into the equation


$$ \Box A_\mu = 0, \tag{2.50} $$


one obtains a so-called dispersion relation for the field:

$$ k_\mu k^\mu = k^2 = \mathbf{k}^2 - \omega^2/c^2 = 0. \tag{2.51} $$


This equation may be thought of as a constraint on the allowed values of k. The 
total field may be written compactly in the form 


$$ A_\mu(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\, e^{ik_\mu x^\mu} A_\mu(k)\,\delta(k^2), \tag{2.52} $$


where $A_\mu(k)$ represents the amplitude of the wave with wavenumber $k_i$, and
the vector index specifies the polarization of the wave modes. From the gauge
condition in eqn. (2.23), we have


$$ k_\mu A^\mu(k) = 0. \tag{2.53} $$


The delta-function constraint in eqn. (2.52) ensures that the combination of 
waves satisfies the dispersion relation in eqn. (2.51). If we use the property 
of the delta function expressed in Appendix A, eqn. (A.15), then eqn. (2.52) 
may be written 


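$$ A_\mu(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\, e^{i(k_i x^i - \omega t)} A_\mu(k)\,\frac{1}{2ck_i}\left(\frac{\partial\omega}{\partial k_i}\right) \left[\delta\!\left(k_0 - \sqrt{\mathbf{k}^2}\right) + \delta\!\left(k_0 + \sqrt{\mathbf{k}^2}\right)\right]. \tag{2.54} $$


The delta functions ensure that the complex exponentials are waves travelling at
the so-called phase velocity


$$ v^i_{\rm ph} = \frac{\omega}{k_i}, \tag{2.55} $$

where $\omega$ and $k_i$ satisfy the dispersion relation. The amplitude of the wave clearly
changes at the rate


$$ v^i_{\rm gr} = \frac{\partial\omega}{\partial k_i}, \tag{2.56} $$
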
known as the group velocity. By choosing the coefficient C(k) for each 
frequency and wavelength, the super-position principle may be used to build 
up any complementary (steady state) solution to the Maxwell field. We shall use 
this approach for other fields in chapter 5. 
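
As a rough numerical illustration of the dispersion relation (2.51) and of the two velocities just defined, the sketch below evaluates them for the vacuum relation between frequency and wavevector. The function names, the finite-difference step and the sample wavevector are assumptions of the sketch. For simplicity the phase speed is taken along the direction of the wavevector rather than component by component.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def omega(k, c=C):
    """Vacuum dispersion relation implied by eqn. (2.51): omega = c |k|."""
    return c * math.sqrt(sum(ki * ki for ki in k))

def phase_speed(k, c=C):
    """Speed of the wave fronts along the direction of k: omega / |k|."""
    return omega(k, c) / math.sqrt(sum(ki * ki for ki in k))

def group_velocity(k, c=C, h=1e-6):
    """Central-difference estimate of the components d(omega)/d(k_i)."""
    v = []
    for i in range(len(k)):
        k_plus = list(k)
        k_minus = list(k)
        k_plus[i] += h
        k_minus[i] -= h
        v.append((omega(k_plus, c) - omega(k_minus, c)) / (2.0 * h))
    return v

k = (3.0, 4.0, 0.0)  # hypothetical wavevector in rad/m
v_gr = group_velocity(k)
print(phase_speed(k) / C)                            # ratio of phase speed to c
print(math.sqrt(sum(v * v for v in v_gr)) / C)       # ratio of group speed to c
```

In vacuum both ratios should come out as 1 to within the accuracy of the finite difference, reflecting the absence of dispersion. A medium with a frequency-dependent refractive index would make them differ.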


2.2 Conservation laws


The simple observation of ‘what goes in must come out’ applies to many 
physical phenomena, including electromagnetism, and forms a predictive frame- 
work with rich consequences. Conservation is a physical fact, which must be 
reflected in dynamical models. Just as economics uses money as a book-keeping 
parameter for transactions, so physics accounts for transactions (interactions) 
with energy, charge and a variety of similarly contrived labels which have proven 
useful in keeping track of ‘stock’ in the physical world. 


2.2.1 Current conservation 


Perhaps the central axiom of electromagnetism is the conservation of total 
electric charge. An algebraic formulation of this hypothesis provides an 
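important relationship, which will be referred to many times. Consider the
electric current $I$, defined in terms of the rate of flow of charge out of a
region through its bounding surface, so that the enclosed charge $Q$ decreases:


$$ I = \oint \mathbf{J}\cdot d\mathbf{S} = -\frac{dQ}{dt}. \tag{2.57} $$

Expressing the charge $Q$ as the integral over the charge density, and using the
divergence theorem on the left hand side, one has

$$ \int \nabla\cdot\mathbf{J}\, d\sigma = -\partial_t \int \rho_e\, d\sigma. \tag{2.58} $$

Comparing the integrand on the left and right hand sides gives

$$ \frac{\partial\rho_e}{\partial t} + \nabla\cdot\mathbf{J} = 0, \tag{2.59} $$

or, in index notation,

$$ \partial_i J^i = -\partial_t\rho_e. \tag{2.60} $$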


This may now be expressed in 4-vector language (or (n + 1)-vector language), 
and the result is: 


$$ \partial_\mu J^\mu = 0. \tag{2.61} $$


This result is known as a continuity condition or a conservation law. All
conservation laws have this essential form, for some (n + 1) dimensional current
vector $J^\mu$. The current is then called a conserved current. In electromagnetism
one can verify that the conservation of current is compatible with the equations
of motion very simply in eqn. (2.32) by taking the 4-divergence of the equation:


$$ \partial_\mu\partial_\nu F^{\mu\nu} = \mu_0\,\partial_\mu J^\mu = 0. \tag{2.62} $$


The fact that the left hand side is zero follows directly from the anti-symmetrical
and non-singular nature of $F_{\mu\nu}$.
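
The vanishing of the left hand side can be made explicit in one line. The sketch relabels the dummy indices and then uses the anti-symmetry of the field strength, assuming the derivatives commute (which is where the non-singular nature of the field is needed):

```latex
\partial_\mu\partial_\nu F^{\mu\nu}
  = \partial_\nu\partial_\mu F^{\nu\mu}
  = -\,\partial_\mu\partial_\nu F^{\mu\nu}
  \quad\Longrightarrow\quad
  \partial_\mu\partial_\nu F^{\mu\nu} = 0 .
```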


2.2.2 Poynting's vector and energy conservation


The electromagnetic field satisfies a continuity relation of its own. This relation 
is arrived at by considering the energy flowing through a unit volume of the field.
The quantities defined below will later re-emerge in a more general form as the 
so-called field theoretical energy-momentum tensor. 

The passage of energy through an electromagnetic system may be split up into 
two contributions. The first is the work done on any electric charges contained 
in the volume. This may be expressed in terms of the current density and the 
electric field as follows. The rate at which work is done on a moving charge is 
given by the force vector dotted with the rate of change of the displacement (i.e. 
the velocity), $\mathbf{F}\cdot\mathbf{v}$. The force, in turn, is given by the charge
multiplied by the electric field strength $q\mathbf{E}$, which we may write in terms of
the charge density $\rho_e$ inside a spatial volume $d\sigma$ as $\rho_e\mathbf{E}\,d\sigma$.
The rate at which work is done on charges
may now be expressed in terms of an external source or current, by identifying
the external current to be the density of charge which is flowing out of some
volume of space with a velocity $\mathbf{v}$:


$$ \mathbf{J}_{\rm ext} = \rho_e\mathbf{v}. \tag{2.63} $$

We have

$$ \text{Rate of work} = \mathbf{E}\cdot\mathbf{J}_{\rm ext}\,d\sigma. \tag{2.64} $$


The second contribution to the energy loss from a unit volume is due to the
flux of radiation passing through the surface ($S$) which bounds the infinitesimal
volume ($\sigma$). This flux of radiation is presently unknown, so we shall refer to it
as $\mathbf{S}$. If we call the total energy density $H$, then we may write that the loss of
energy from an infinitesimal volume is the sum of these two contributions:


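$$ -\partial_t\int_\sigma H\,d\sigma = \oint_S \mathbf{S}\cdot d\mathbf{S} + \int_\sigma \mathbf{E}\cdot\mathbf{J}_{\rm ext}\,d\sigma. \tag{2.65} $$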


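In 1884, Poynting identified $H$ and $\mathbf{S}$ using Maxwell's equations. We shall now
do the same. The aim is to eliminate the current $\mathbf{J}_{\rm ext}$ from this relation
in order to express $H$ and $\mathbf{S}$ in terms of $\mathbf{E}$ and $\mathbf{B}$ only. We begin
by using the fourth Maxwell equation (2.1d) to replace $\mathbf{J}_{\rm ext}$ in eqn. (2.65):


$$ \mathbf{E}\cdot\mathbf{J}_{\rm ext} = \frac{\mathbf{E}\cdot(\nabla\times\mathbf{B})}{\mu_0} - \epsilon_0\,\mathbf{E}\cdot\partial_t\mathbf{E}. \tag{2.66} $$


Using the vector identity in Appendix A, eqn. (A.71), we may write

$$ \mathbf{E}\cdot(\nabla\times\mathbf{B}) = \nabla\cdot(\mathbf{B}\times\mathbf{E}) + \mathbf{B}\cdot(\nabla\times\mathbf{E}). \tag{2.67} $$

The second Maxwell eqn. (2.1b) may now be used to replace $\nabla\times\mathbf{E}$, giving


$$ \mathbf{E}\cdot(\nabla\times\mathbf{B}) = \nabla\cdot(\mathbf{B}\times\mathbf{E}) - \mathbf{B}\cdot\partial_t\mathbf{B}. \tag{2.68} $$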
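
Finally, noting that

$$ \frac{1}{2}\partial_t(\mathbf{X}\cdot\mathbf{X}) = \mathbf{X}\cdot\partial_t\mathbf{X}, \tag{2.69} $$


and using this with $\mathbf{X} = \mathbf{E}$ and $\mathbf{X} = \mathbf{B}$ in eqns. (2.66) and (2.68), we may write:


$$ \mathbf{E}\cdot\mathbf{J}_{\rm ext} = \frac{\nabla\cdot(\mathbf{B}\times\mathbf{E})}{\mu_0} - \frac{1}{2}\partial_t\left(\epsilon_0\,\mathbf{E}\cdot\mathbf{E} + \frac{1}{\mu_0}\mathbf{B}\cdot\mathbf{B}\right). \tag{2.70} $$


This equation has precisely the form of eqn. (2.65), and the pieces can now be
identified:


$$ S^0 = H = \frac{1}{2}\left(\epsilon_0\,\mathbf{E}\cdot\mathbf{E} + \frac{1}{\mu_0}\mathbf{B}\cdot\mathbf{B}\right) = \frac{1}{2}\left(\mathbf{E}\cdot\mathbf{D} + \mathbf{B}\cdot\mathbf{H}\right) \tag{2.71} $$

$$ S^i = \mathbf{S} = \frac{\mathbf{E}\times\mathbf{B}}{c\mu_0} = \frac{\mathbf{E}\times\mathbf{H}}{c}. \tag{2.72} $$


The new fields $\mathbf{D} = \epsilon_0\mathbf{E}$ and $\mu_0\mathbf{H} = \mathbf{B}$ have been defined.
The energy density $H$
is often referred to as a Hamiltonian for the free electromagnetic field, whereas
$\mathbf{S}$ is referred to as the Poynting vector.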
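
The energy density and the energy flux can be evaluated for a concrete field configuration. The sketch below is illustrative: the helper names and the sample plane-wave amplitude are assumptions. It computes the conventional flux, the cross product of E and B divided by the vacuum permeability, which differs from the spatial components in eqn. (2.72) only by the factor of c used there to form an (n + 1) dimensional vector.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 1.25663706212e-6    # vacuum permeability, H/m
C = 1.0 / math.sqrt(EPS0 * MU0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def energy_density(E, B):
    """H = (eps0 E.E + B.B / mu0) / 2, as in eqn. (2.71)."""
    return 0.5 * (EPS0 * dot(E, E) + dot(B, B) / MU0)

def poynting_flux(E, B):
    """Conventional energy flux E x B / mu0 (c times the S^i of eqn. (2.72))."""
    return tuple(component / MU0 for component in cross(E, B))

# Hypothetical plane wave travelling along z: E along x, B along y, |B| = |E| / c.
E0 = 100.0  # V/m
E = (E0, 0.0, 0.0)
B = (0.0, E0 / C, 0.0)

H = energy_density(E, B)
S = poynting_flux(E, B)
print(S)          # only the z-component is non-zero
print(S[2] / H)   # for a plane wave this ratio should be close to c
```

For a plane wave the electric and magnetic contributions to the energy density are equal, so the energy is transported at speed c along the direction of propagation.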


The quantity

$$ \partial_\mu S^\mu = F_{0\mu}J^\mu_{\rm ext} \tag{2.73} $$


is the rate at which work is done by an infinitesimal volume of the field. It
is clear from the appearance of an explicit zero component in the above that
this argument cannot be the whole story. One expects a generally covariant
expression. The expression turns out to be


$$ \partial_\mu\theta^{\mu\nu} = F^{\mu\nu}J_\mu, \tag{2.74} $$


where $\theta_{\mu\nu}$ is the energy-momentum tensor. Notice how it is a surface integral
which tells us about flows in and out of a volume of space. One meets this idea 
several times, in connection with boundary conditions and continuity. 


2.3 Electromagnetism in matter 


To describe the effect of matter on the electromagnetic field in a covariant way, 
one may use either a microscopic picture of the field interacting with matter at 
the molecular level, or a macroscopic, effective field theory, which hides the 
details of these interactions by defining equivalent fields. 

Fig. 2.1. Matter is not electrically neutral at the microscopic level.


2.3.1 Dielectrics 


One tends to think of ordinary matter as being electrically neutral, but of course 
it is composed of atoms and molecules, which have a centre of positive charge 
and a centre of negative charge — and these two centres do not necessarily lie at 
the same place. The more symmetrical a molecule is, the more neutral it is: for 
instance, the noble gases have highly symmetrical electron orbits and thus have 
almost no polarizability on average; the water molecule, on the other hand, has 
an asymmetry which allows a trickle of water to be attracted by a charged comb. 

When an electric field is switched on in the vicinity of a dielectric material, the 
centres of positive and negative charge in each molecule are forced apart slightly 
(see figure 2.1) in a substance-dependent way. We say that such a molecule has 
a certain polarizability. 

For classical external fields, atoms and molecules behave like dipoles, i.e. 
there is a clear separation of the charge into two parts: a positive pole and a 
negative pole. But we would be doing a disservice to the radiation field (not to 
mention the quantum theory) if we did not recall that the field has a wave nature 
and a characteristic wavelength. Molecules behave like dipoles if the wavelength 
of the external field is large compared to the size of the molecule — since then 
there is a clear direction to the field at every point inside the molecule’s charge 
cloud. If, on the other hand, the wavelength of the field is of the order of the size 
of the molecule or less, then the field can reverse direction inside the molecule 
itself. The charge then gets re-distributed into a more complex pattern, and
so-called quadrupole moments and perhaps higher 'pole moments' must be taken
into account. In this text, we shall only consider the dipole approximation.

The simplest way to model the polarization of an atom or molecule is to view 
it as opposite charges coupled together by a spring. This model is adequate for 
many materials, provided the external electric field is not too strong. Materials
which behave like this are called linear media. Initially, the centres of positive 
and negative charge are in the same place, but, as the external field is switched 
on, they are pulled further and further apart, balanced by the restoring force of 
the spring. This separation of charge creates a new local field, which tends to 
cancel the external field inside the molecule, but to reinforce it at the poles. If the
charge clouds have charge $q$ and the spring constant is $\kappa$ then the force equation
is simply


$$ F = -\kappa s = Eq, \tag{2.75} $$


where $s$ is the displacement of the charges from one another, in the rest frame
of the atoms. The separation multiplied by the charge gives the effective
contribution to the field at the poles of a single molecule, denoted the dipole
moment $d$:

$$ d = q|s| = \frac{q^2}{\kappa}E. \tag{2.76} $$

The quantity $q^2/\kappa$ is denoted by $\alpha$ and is called the polarizability; it denotes
the effective strength of the resistance to polarization. The polarization field is

$$ P = \rho_N d = \rho_N\alpha E, \tag{2.77} $$

where $\rho_N$ is the total number of molecules per unit volume. It is proportional
to the field of particle displacements $s^i(x)$ and it hides some non-covariant
assumptions (see the next section). Normally speaking, one writes $q = -e$,
where $-e$ is the charge on the electron. Then,


$$ \alpha_{\rm static} = \frac{e^2}{\kappa}. \tag{2.78} $$

If one considers time-varying fields, or specifically waves of the form

$$ E = E_0\,e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}, \tag{2.79} $$


it is found that, for a single optically active electron (i.e. one in an orbit which 
can be affected by an external field), the equation of motion is now that of a 
damped harmonic oscillator: 


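$$ m(\omega_0^2 + i\gamma\omega - \omega^2)s = -eE_0, \tag{2.80} $$


where $\omega_0^2 = \kappa/m$ and $\gamma$ is a damping term. Using this equation to replace for $s$
in eqn. (2.76), we get


$$ \alpha(\omega) = \frac{q^2/m}{(\omega_0^2 + i\gamma\omega - \omega^2)}. \tag{2.81} $$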

Thus the polarizability is a frequency-dependent quantity. This explains why a
prism can split a mixture of white light into its component frequencies. A further
definition is of interest, namely the electric susceptibility $\chi_e = N\alpha(\omega)/\epsilon_0$. For
$\rho_N$ particles per unit volume, this is often expressed in terms of the plasma
frequency $\omega_p^2 = Ne^2/m$. Thus,


$$ P = \epsilon_0\chi_e E. \tag{2.82} $$


This is closely related to the change in the refractive index, $n^2 = 1 + \chi_e$, of
a material due to polarization, when $\mu_r = 1$ (which it nearly always is). In
real space, we note from eqn. (2.80) that the polarization satisfies a differential
equation


$$ (\partial_t^2 - \gamma\partial_t + \omega_0^2)P = \frac{q^2}{m}\rho_N E \tag{2.83} $$


and thus the real space susceptibility can be thought of as a Green function for
the differential operator and $E$ may be thought of as a source:


$$ P(t) = \epsilon_0\int dt'\,\chi(t - t')E(t'). \tag{2.84} $$


$\chi(t - t')$ is taken to satisfy retarded boundary conditions, which, in turn, implies
that its real and imaginary parts in terms of $\omega$ are related. The relationship is
referred to as a Kramers-Kronig relation, and is simply a consequence of the
fact that a retarded Green function is real.
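
The frequency dependence of the polarizability in eqn. (2.81) is easy to explore numerically. The following sketch is only an illustration: the function name, the resonance frequency and the damping constant are hypothetical values chosen to show the static limit of eqn. (2.78) and the behaviour at resonance.

```python
import math

Q_E = 1.602176634e-19   # magnitude of the electron charge, C
M_E = 9.1093837015e-31  # electron mass, kg

def polarizability(omega, q, m, kappa, gamma):
    """alpha(omega) = (q^2 / m) / (omega_0^2 + i gamma omega - omega^2), omega_0^2 = kappa / m."""
    omega0_sq = kappa / m
    return (q * q / m) / (omega0_sq + 1j * gamma * omega - omega * omega)

# Hypothetical oscillator: resonance at 1e16 rad/s, damping 1e13 per second.
omega0 = 1.0e16
kappa = M_E * omega0**2
gamma = 1.0e13

alpha_static = polarizability(0.0, Q_E, M_E, kappa, gamma)
alpha_resonant = polarizability(math.sqrt(kappa / M_E), Q_E, M_E, kappa, gamma)

print(alpha_static.real, Q_E**2 / kappa)   # static limit reproduces q^2 / kappa
print(abs(alpha_resonant) / abs(alpha_static))  # enhancement of order omega_0 / gamma
```

At zero frequency the response is purely real. At resonance it is almost purely imaginary and enhanced by roughly the ratio of the resonance frequency to the damping, which is what makes the response of a medium so strongly frequency dependent near an absorption line.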


2.3.2 Covariance and relative motion: the Doppler effect 


The frequency-dependent expressions above are true only in the rest frame of the 
atoms. The results are not covariant with respect to moving frames of reference. 
When one studies solid state systems, such as glasses and crystals, these 
expressions are quite adequate, because the system has a naturally preferred 
rest frame and the atoms in the material do not move relative to one another, on 
average. However, in gaseous matter or plasmas, this is not the case: the thermal 
motion of atoms relative to one another can be important, because of the Doppler 
effect. This fact can be utilized to good effect; for example, in laser cooling the 
motion of atoms relative to a laser radiation field can be used to bring all of the 
atoms into a common rest frame by the use of a resonant, frequency-dependent 
interaction. A Galilean-covariant expression can be written by treating the field 
as one co-moving mass, or as a linear super-position of co-moving masses. With 
only one component co-moving, the transformation of the position of an atom 
in the displacement field can be written 


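$$ x(t) \rightarrow x + vt, \tag{2.85} $$

where $v$ is the velocity of the motion relative to some reference frame (usually
the container of the gaseous matter, or the laboratory frame). This means that
the invariant form $(kx - \omega t)$ is transformed into


$$ \mathbf{k}\cdot\mathbf{x} - \omega t \rightarrow \mathbf{k}\cdot(\mathbf{x} + \mathbf{v}t) - \omega t = \mathbf{k}\cdot\mathbf{x} - \omega_\beta t, \tag{2.86} $$

where

$$ \omega_\beta = \omega(1 - \hat{\mathbf{k}}\cdot\boldsymbol{\beta}) = \omega(1 - \hat{\mathbf{k}}\cdot\mathbf{v}/c). \tag{2.87} $$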


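Thus, the expressions above can be used, on replacing $\omega$ with a sum over all $\omega_\beta$,
and integrated over all the values of the velocity vector $\beta^i$ of which the field is
composed. The polarizability takes the form


$$ \alpha(\omega) = \frac{q^2}{m}\sum_\beta \frac{1}{(\omega_0^2 + i\gamma\omega - \omega_\beta^2)}, \tag{2.88} $$

where

$$ \omega_\beta = (1 - \hat{k}_i\beta^i)\,\omega. \tag{2.89} $$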
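
The Doppler-shifted frequency can be read off from the substitution in eqn. (2.86) as the original frequency minus the scalar product of the wavevector with the velocity. The sketch below evaluates it for an atom moving along the direction of propagation. The function name and the sample numbers (an optical frequency and a thermal speed of a few hundred metres per second) are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def doppler_shifted_frequency(omega, k, v):
    """omega_beta = omega - k.v, the frequency appearing on the right of eqn. (2.86)."""
    return omega - sum(ki * vi for ki, vi in zip(k, v))

omega = 2.0 * math.pi * 5.0e14   # hypothetical optical angular frequency, rad/s
k = (omega / C, 0.0, 0.0)        # vacuum wavevector along x
v = (300.0, 0.0, 0.0)            # hypothetical thermal velocity along x, m/s

omega_beta = doppler_shifted_frequency(omega, k, v)
print((omega - omega_beta) / omega)  # fractional shift, equal to v / c here
```

The fractional shift is tiny, of order one part in a million, but it is comparable to the width of a sharp atomic resonance, which is why it matters in gases and in laser cooling.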


2.3.3 Refractive index in matter 


It appears that the introduction of a medium destroys the spacetime covariance 
of the equations of motion. In fact this is false. What is interesting is that 
covariance can be restored by re-defining the (n + 1) dimensional vectors so as 
to replace the speed of light in a vacuum with the effective speed of light in a 
medium. The speed of light in a dielectric medium is 


$$ v = \frac{c}{n}, \tag{2.90} $$

where $n = \sqrt{\epsilon_r\mu_r} > 1$ is the refractive index of the medium.

Before showing that covariance can be restored, one may consider the 
equation of motion for waves in a dielectric medium from two perspectives. 
The purpose is to relate the multifarious fields to the refractive index itself. It is 
also to demonstrate that the polarization terms can themselves be identified as 
currents which are set in motion by the electric field. In other words, we will 
show the equivalence of (i) $P \neq 0$, but $J_\mu = 0$, and (ii) $P = 0$ with $J_\mu$ given by
the current resulting from charges on springs! Taking

$$ J^0 = c\rho_e, \qquad J^i = -\rho_N e\frac{ds^i}{dt}, \tag{2.91} $$

the current is seen to be a result of the net charge density set in motion by the
field. This particular derivation applies only in 3 + 1 dimensions.

To obtain the wave equation we begin with the Bianchi identity

$$ \epsilon_{ijk}\partial_j E_k + \partial_t B_i = 0, \tag{2.92} $$


and then operate from the left with $\epsilon_{lim}\partial_l$, having relabelled the free
index of eqn. (2.92) as $m$. Using the identity (A.38) (see
Appendix A) for the product of two anti-symmetric tensors, we obtain


$$ [\nabla^2 E_i - \partial_i(\partial_j E_j)] + \epsilon_{lim}\partial_l\partial_t B_m = 0. \tag{2.93} $$


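Taking $\partial_t$ of the fourth Maxwell equation, one obtains


$$ \frac{1}{\mu_0\mu_r}\epsilon_{ijk}\partial_j\partial_t B_k = \partial_t J_i + \epsilon_0\epsilon_r\frac{\partial^2 E_i}{\partial t^2}. \tag{2.94} $$


These two equations can be used to eliminate $B_i$, giving an equation purely
in terms of the electric field. Choosing the charge distribution to be isotropic
(uniform in space), we have $\partial_i\rho_e = 0$, and thus


$$ \left[\nabla^2 - \frac{n^2}{c^2}\frac{\partial^2}{\partial t^2}\right]E_i = \mu_0\mu_r\,\partial_t J_i. \tag{2.95} $$


In this last step, we used the definition of the refractive index in terms of $\epsilon_r$:

$$ n^2 = \epsilon_r\mu_r = (1 + \chi_e)\mu_r. \tag{2.96} $$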


This result is already suggestive of the fact that Maxwell’s equations in a 
medium can be written in terms of an effective speed of light. 
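We may now consider the two cases: (i) $P \neq 0$, but $J_\mu = 0$,


$$ \left[\nabla^2 - \frac{n^2}{c^2}\frac{\partial^2}{\partial t^2}\right]E_i = 0; \tag{2.97} $$

and (ii) $P = 0$ ($n = 1$), $J_\mu \neq 0$,

$$ \left[\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right]E_i = \mu_0\mu_r\,\frac{-\rho_N e^2\omega^2/m\cdot E_i}{(\omega_0^2 + i\gamma\omega - \omega^2)}. \tag{2.98} $$


The differential operators on the left hand side can be replaced by $k^2$ and $\omega^2$, by
using the wave solution (2.79) for the electric field to give a 'dispersion relation'
for the field. Dividing through by $\omega^2$ and using $\mu_0 = 1/(c^2\epsilon_0)$, this gives:


$$ \frac{k^2}{\omega^2} = \frac{1}{c^2}\left(1 + \frac{\mu_r}{\epsilon_0}\frac{\rho_N e^2/m}{(\omega_0^2 + i\gamma\omega - \omega^2)}\right) = \frac{n^2}{c^2}. \tag{2.99} $$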

So, from this slightly cumbersome expression for the motion of charges, one
derives the microscopic form of the refractive index. In fact, comparing
eqns. (2.99) and (2.98), one sees that


$$ n^2 = 1 + \frac{\rho_N\alpha(\omega)\mu_r}{\epsilon_0}. \tag{2.100} $$

Since $\mu_r$ is very nearly unity in all materials that waves penetrate, it is common
to ignore this and write


$$ n^2 \sim 1 + \chi_e. \tag{2.101} $$


The refractive index is a vector in general, since a material could have a 
different index of refraction in different directions. Such materials are said to be 
anisotropic. One now has both microscopic and macroscopic descriptions for 
the interaction of radiation with matter, and it is therefore possible to pick and 
choose how one wishes to represent this physical system. The advantage of the 
microscopic formulation is that it can easily be replaced by a quantum theory 
at a later stage. The advantage of the macroscopic field description is that it is 
clear why the form of Maxwell’s equations is unaltered by the specific details of 
the microscopic interactions. 
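
The microscopic expression for the refractive index in eqn. (2.100) can be evaluated directly. The sketch below is a rough illustration with hypothetical parameters: a number density typical of a gas at room conditions and a single resonance in the ultraviolet. The function names are assumptions, and the relative permeability is set to one.

```python
import cmath

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
Q_E = 1.602176634e-19     # magnitude of the electron charge, C
M_E = 9.1093837015e-31    # electron mass, kg

def alpha(omega, omega0, gamma, e=Q_E, m=M_E):
    """Polarizability of a single damped oscillator, as in eqn. (2.81)."""
    return (e * e / m) / (omega0**2 + 1j * gamma * omega - omega**2)

def refractive_index_squared(omega, rho_n, omega0, gamma, mu_r=1.0):
    """n^2 = 1 + rho_N alpha(omega) mu_r / eps0, as in eqn. (2.100)."""
    return 1.0 + rho_n * alpha(omega, omega0, gamma) * mu_r / EPS0

RHO_N = 2.5e25     # hypothetical number density, molecules per cubic metre
OMEGA0 = 1.0e16    # hypothetical resonance frequency, rad/s
GAMMA = 1.0e13     # hypothetical damping constant, 1/s

n_sq_static = refractive_index_squared(0.0, RHO_N, OMEGA0, GAMMA)
n_sq_visible = refractive_index_squared(3.0e15, RHO_N, OMEGA0, GAMMA)

print(n_sq_static.real)
print(cmath.sqrt(n_sq_visible))  # complex index; imaginary part encodes absorption
```

Below the resonance the real part of the squared index exceeds one and grows with frequency, which is the normal dispersion responsible for the splitting of white light by a prism.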


2.4 Aharonov-Bohm effect 


The physical significance of the vector potential $A_\mu$ (as opposed to the field
$F_{\mu\nu}$) was moot prior to the arrival of quantum mechanics. For many, the
vector potential was merely an artifice, useful in the computation of certain 
boundary value problems. The formulation of quantum mechanics as a local 
field theory established the vector potential as the fundamental local field, and 
the subsequent attention to gauge symmetry fuelled pivotal developments in the 
world of particle physics. Today, it is understood that there is no fundamental 
difference between treating the basic electromagnetic interaction as a rank 2 
anti-symmetric tensor $F_{\mu\nu}$, or as a vector with the additional requirement of
gauge invariance. They are equivalent representations of the problem. In 
practice, however, the vector potential is the easier field to work with, since 
it couples locally. The price one pays lies in ensuring that gauge invariance is 
maintained (see chapter 9). 

The view of the vector potential as a mathematical construct was shaken by 
the discovery of the Aharonov-Bohm effect. This was demonstrated in a classic
experiment of electron interference through a double slit, in which electrons are
made to pass through an area of space in which $A_\mu \neq 0$ but where $F_{\mu\nu} = 0$.
The fact that a change in the electron interference pattern was produced by this
configuration was seen as direct evidence for the physical reality of $A_\mu$. Let us
examine this phenomenon. 

Fig. 2.2. The Aharonov-Bohm experiment.


The physical layout of the double-slit experiment is shown in figure 2.2. 
An electron source fires electrons at the slits; these pass through the slits and
interfere in the usual way, forming an interference pattern on the screen at the 
end of their path. In order to observe the Aharonov-Bohm effect, one places a 
solenoid on the far side of the slits, whose magnetic field is constrained within 
a cylinder of radius R. The vector potential arising from the solenoid geometry 
is not confined to the inside of the solenoid however. It also extends outside of 
the solenoid, but in such a way as to produce no magnetic field. 

What is remarkable is that, when the solenoid is switched on, the interference 
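pattern is shifted by an amount $x$. This indicates that a phase shift $\Delta\theta$ is
introduced between the radiation from the two slits, and is caused by the
presence of the solenoid. If the distance $L$ is much greater than $x$ and $a$ then we
have


$$ \begin{aligned} a &\sim \frac{x}{L}\,d \\ \Delta\theta &= 2\pi\left(\frac{L_1 - L_2}{\lambda}\right) = \frac{2\pi a}{\lambda} \\ x &= \left(\frac{L\lambda}{2\pi d}\right)\Delta\theta. \end{aligned} \tag{2.102} $$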


The phase difference can be accounted for by the gauge transformation of the 
electron field by the vector potential. Although the absolute value of the vector 
potential is not gauge-invariant, the potential difference between the paths is. 
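
The vector potential inside and outside the solenoid position is


$$ \begin{aligned} (r < R):&\quad A_\phi = \frac{1}{2}Br, \quad A_r = A_z = 0 \\ (r > R):&\quad A_\phi = \frac{BR^2}{2r}, \quad A_r = A_z = 0. \end{aligned} \tag{2.103} $$

The magnetic field in the regions is


$$ \begin{aligned} B_z &= \nabla_r A_\phi - \nabla_\phi A_r \\ &= \frac{1}{r}\left[\frac{\partial}{\partial r}(rA_\phi) - \frac{\partial}{\partial\phi}A_r\right] \\ &= B \quad (r < R) \\ &= 0 \quad (r > R). \end{aligned} \tag{2.104} $$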


The phase difference can be determined, either from group theory, or from 
quantum mechanics to be 


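$$ \exp(i\theta) = \exp\left(i\frac{e}{\hbar}\int_P A^i\,dx_i\right), \tag{2.105} $$


where 'P' indicates the integral along a given path. Around the closed loop
from one slit to the screen and back to the other slit, the phase difference is
(using Stokes' theorem)


$$ \begin{aligned} \Delta\theta &= \theta_1 - \theta_2 \\ &= \frac{e}{\hbar}\oint A^i\,dx_i \\ &= \frac{e}{\hbar}\int(\nabla\times\mathbf{A})\cdot d\mathbf{S} \\ &= \frac{e}{\hbar}\int\mathbf{B}\cdot d\mathbf{S}. \end{aligned} \tag{2.106} $$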


The phase shift therefore results from the paths having to travel around the 
solenoid, i.e. in a loop where magnetic flux passes through a small part of 
the centre. Note, however, that the flux does not pass through the path of the
electrons; only the vector potential is non-zero for the straight-line paths.
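
The statement that the phase depends only on the enclosed flux can be checked with the vector potential of eqn. (2.103). The sketch below is illustrative: the function names, the field strength and the solenoid radius are hypothetical. It evaluates the line integral of the potential around a circle centred on the solenoid axis and compares it with the flux through the solenoid, both multiplied by the ratio of the electron charge to the reduced Planck constant.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19  # magnitude of the electron charge, C

def a_phi(r, B, R):
    """Azimuthal vector potential of an ideal solenoid of radius R, eqn. (2.103)."""
    return 0.5 * B * r if r < R else B * R * R / (2.0 * r)

def loop_phase(B, R, r_loop):
    """(e / hbar) times the line integral of A around a circle of radius r_loop."""
    # A_phi is constant on the circle, so the integral is A_phi times the circumference.
    return E_CHARGE * a_phi(r_loop, B, R) * 2.0 * math.pi * r_loop / HBAR

def flux_phase(B, R):
    """(e / hbar) times the magnetic flux B * pi * R^2 confined to the solenoid."""
    return E_CHARGE * B * math.pi * R * R / HBAR

B_FIELD = 0.1     # hypothetical field inside the solenoid, T
RADIUS = 1.0e-6   # hypothetical solenoid radius, m

print(flux_phase(B_FIELD, RADIUS))
print(loop_phase(B_FIELD, RADIUS, 2.0 * RADIUS))
print(loop_phase(B_FIELD, RADIUS, 5.0 * RADIUS))
```

The three numbers should agree: any loop lying wholly outside the solenoid picks up the same phase, even though the magnetic field vanishes everywhere along it.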

There are two ways of expressing this: (i) electrons must be affected by the 
vector potential, since the field is zero for any classical path from the slits to the 
screen; or (ii) electrons are stranger than we think: they seem to be affected by 
a region of space which is classically inaccessible to them. The viewpoints are 
really equivalent, since the vector potential is simply an analytic extension of 
the field strength, but the result is no less amazing. It implies a non-locality in 
the action of the magnetic field: action at a distance, and not only at a distance,
but from within a container. If one chooses to believe in the vector potential as 
a fundamental field, the behaviour seems less objectionable: the interaction is 
then local. There is no action at a distance, and what happens inside the solenoid 
is of less interest. 

Whether one chooses to view this as evidence for the physical reality of 
the vector potential or of the strangeness of quantum mechanics is a matter 
of viewpoint. Indeed, the reality of any field is only substantiated by the 
measurable effect it has on experiments. However, there are deeper reasons for 
choosing the interpretation based on the reality of the vector potential, which 
have to do with locality and topology, so at the very least this gives us a new 
respect for the utility of the vector potential. In view of the utility of $A_\mu$ and its
direct appearance in dynamical calculations, it seems reasonable to accept it as 
the fundamental field in any problem which is simplified by that assumption. 


3 


Field parameters 


The parameters which measure change in dynamical systems have a unique 
importance: they describe both the layout and the development of a system. 
Space (position) and time are the most familiar parameters, but there are other 
possibilities, such as Fourier modes. 

In the previous chapter, it was seen how the unification of spatial and temporal 
parameters, in electromagnetism, led to a tidier and deeper form of the Maxwell 
equations. It also made the equations easier to transform into other relativistic 
frames. In the covariant approach to physics one is concerned with what 
does and does not change, when shifting from one perspective to another, 
i.e. with the properties of a system which are dependent and independent of 
the circumstances of observation. In a continuous, holonomic system, this is 
summarized by two independent concepts: parameter spaces and coordinates. 


• Parameter space (manifold). This represents the stage for physical
reality. A parameter space has coordinate-independent properties such 
as topology and curvature. 


• Coordinates. These are arbitrary labels used to mark out a reference
scheme, or measurement scheme, in parameter space. There is no unique 
way to map out a parameter space, e.g. Cartesian or polar coordinates. 
If there is a special symmetry, calculations are often made easier by 
choosing coordinates which match this symmetry. 


Coordinates are labels which mark a scale on a parameter space. They measure 
a distance in a particular direction from an arbitrary origin. Clearly, there 
is nothing fundamental about coordinates: by changing the arbitrary origin, 
or orientation of measurement, all coordinate labels are changed, but the 
underlying reality is still the same. They may be based on flat Cartesian (x, y, z) 
or polar $(r, \theta, \phi)$ conventions; they can be marked on flat sheets or curved shells. 
Underneath the details of an arbitrary system of measurement is a physical 
system which owes nothing to those details. 

The invariant properties or symmetries of parameter spaces have many 
implicit consequences for physical systems; not all are immediately intuitive. 
For this reason, it is useful to study these invariant properties in depth, to see 
how they dictate the possibilities of behaviour (see chapter 9). For now it is 
sufficient to define a notation for coordinates on the most important parameter 
spaces. 

This chapter summarizes the formulation of (n + 1) dimensional vectors in 
Minkowski spacetime and in its complementary space of wavevectors k, usually 
called momentum space or reciprocal lattice space. 


3.1 Choice of parametrization 


The dynamical variables, in field theory, are the fields themselves. They are 
functions of the parameters which map out the background space or spacetime; 


e.g. 
$$ \psi(t), \quad \phi(x), \quad \chi(t, r, \theta, \phi). \tag{3.1} $$


Field variables are normally written as functions of spacetime positions, but 
other decompositions of the field are also useful. Another ubiquitous choice 
is to use a complementary set of variables based upon a decomposition of the 
field into a set of basis functions, a so-called spectral decomposition. Given 
a complete set of functions $\psi_i(x)$, one can always write an arbitrary field as a 
linear superposition: 


$$ \phi(x) = \sum_i c_i\, \psi_i(x). \tag{3.2} $$


Since the functions are fixed and known, a knowledge of the coefficients $c_i$ in 
this decomposition is equivalent to a knowledge of $\phi(x)$, i.e. as a function of $x$. 
However, the function may also be written in a different parametrization: 


$$ \phi(c_1, c_2, c_3, \ldots). \tag{3.3} $$


This is a shorthand for the decomposition above, just as $\phi(x)$ is a shorthand for 
a polynomial or series in $x$. Usually, an infinite number of such coefficients is 
needed to prescribe a complete decomposition of the field, as, for instance, in 
the Fourier expansion of a function, described below. 

Spacetime is an obvious parameter space for a field theory since it comprises 
the world around us and it includes laboratories where experiments take place, 
but other basis functions sometimes reveal simpler descriptions. One important 
example is the complementary Fourier transform of spacetime. The Fourier 
transform is important in situations where one suspects a translationally invariant, 
homogeneous system. The Fourier transform of a function of $x$ is defined 
to be a new function of the wavenumber $k$ (and the inverse transform) by the 
relations: 


$$ f(x) = \int \frac{dk}{2\pi}\, e^{ikx} f(k) $$
$$ f(k) = \int dx\, e^{-ikx} f(x). \tag{3.4} $$


k is a continuous label on a continuous set of functions exp(ikx), not a discrete 
set of $c_i$, for integer $i$. In solid state physics, the space parametrized by $k$ is 
called the reciprocal lattice space. Fourier transform variables are useful for 
many purposes, such as revealing hidden periodicities in a function, since the 
expansion is based on periodic functions. The Fourier transform is also a useful 
calculational aid. 

Spacetime (configuration space) and the Fourier transform are two com- 
plementary ways of describing the basic evolution of most systems. These 
two viewpoints have advantages and disadvantages. For example, imagine a 
two-state system whose behaviour in time can be drawn as a square wave. To 
represent a square wave, one requires either an infinite number of Fourier waves 
of different frequencies, or merely two positions over time. In that case, it 
would be cumbersome to use a Fourier representation of the time 
evolution. 
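
The square-wave remark can be made concrete with a small numerical sketch. The code below is only an illustration, not part of the original text: the function names, the choice of period and the sampling window are all assumptions. It compares a two-valued signal with partial sums of its standard Fourier sine series and shows that many harmonics are needed to approach what two positions describe exactly.

```python
import math

def square_wave(t, period=2.0):
    """Two-state signal: +1 in the first half of each period, -1 in the second."""
    return 1.0 if (t % period) < period / 2 else -1.0

def fourier_partial_sum(t, n_terms, period=2.0):
    """Sum of the first n_terms odd harmonics of the square wave's sine series."""
    total = 0.0
    for j in range(n_terms):
        m = 2 * j + 1  # only odd harmonics contribute to this square wave
        total += (4.0 / (math.pi * m)) * math.sin(2.0 * math.pi * m * t / period)
    return total

def max_error(n_terms, samples=200, period=2.0):
    """Largest deviation from the square wave on the window 0.1 <= t <= 0.9,
    which lies inside the +1 plateau and away from the jumps."""
    worst = 0.0
    for s in range(samples):
        t = 0.1 + 0.8 * s / (samples - 1)
        diff = fourier_partial_sum(t, n_terms, period) - square_wave(t, period)
        worst = max(worst, abs(diff))
    return worst
```

Comparing `max_error(5)` with `max_error(200)` shows the slow convergence: the configuration-space description needs two numbers, while the Fourier description needs a long list of coefficients.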


3.2 Configuration space 


The four-dimensional vectors used to re-write electromagnetism are easily 
generalized to (n + 1) spacetime dimensions, for any positive n. They place time 
and space on an almost equal footing. In spite of the notational convenience of 
unified spacetime, some caution is required in interpreting the step. Time is not 
the same as space: formally, it distinguishes itself by a sign in the metric tensor; 
physically, it plays a special role in determining the dynamics of a system. 


3.2.1 Flat and curved space 


Physical systems in constrained geometries, such as on curved surfaces, or 
within containers, are best described using curvilinear coordinates. Experi- 
mental apparatus is often spherical or toroidal; shapes with a simple symmetry 
are commonly used when generating electromagnetic fields; rectangular fields 
with sharp corners are less common, since these require much higher energy to 
sustain. 

Studies of what happens within the volumes of containers, and what happens 
on their surface boundaries, are important in many situations [121]. When 
generalizing, to study systems in (n + 1) dimensions, the idea of surfaces and 
volumes also has to be generalized. The distinction becomes mainly one of 
convenience: (n + 1) dimensional curved surfaces are curved spacetimes. The 
fact that they enclose a volume or partition a space which is (n + 2) dimensional 
is not always germane to the discussion at hand. This is particularly true in 
cosmology. 

It is important to distinguish between curvilinear coordinates in flat space 
and coordinate labellings of curved space. An example of the former is the 
use of polar $(r, \theta)$ coordinates to map out a plane. The plane is flat, but the 
coordinates span the space in a set of curved rings. An example of the latter 
is $(\theta, \phi)$ coordinates (at fixed $r$), mapping out the surface of a sphere. Over 
very short distances, $(\theta, \phi)$ can be likened to a tiny planar patch with Cartesian 
coordinates (x, y). 

Einstein’s contribution to the theory of gravity was to show that the laws of 
gravitation could be considered as an intrinsic curvature of a (3+ 1) dimensional 
spacetime. Einstein used the idea of covariance to argue that one could view 
gravity in one of two equivalent ways: as forced motion in a flat spacetime, 
or as free-fall in a curved spacetime. Using coordinates and metric tensors, 
gravitation could itself be described as a field theory, in which the field $g_{\mu\nu}(x)$ 
was the shape of spacetime itself. 

Gravitational effects may be built into a covariant formalism to ensure that 
every expression is general enough to be cast into an arbitrary scheme of 
coordinates. If one allows for general coordinates (i.e. general covariance), 
one does not assume that all coordinates are orthogonal Cartesian systems, and 
gravity and curvature are not excluded from the discussion. 

Spacetime curvature will not be treated in detail here, since this topic is widely 
discussed in books on relativity. However, we take the issue of curvature ‘under 
advisement’ and construct a formalism for dealing with arbitrary coordinates, 
assured that the results will transform correctly even in a curved environment. 


3.2.2 Vector equations 


Vector methods express spatial relationships, which remain true regardless of 
the system of coordinates used to write them down. They thus play a central 
role in covariant formulation. For example, the simple vector equation 


$$ \mathbf{A} \cdot \mathbf{B} = 0 \tag{3.5} $$


expresses the fact that two vectors A and B are orthogonal. It says nothing about 
the orientation of the vectors relative to a coordinate system, nor their position 
relative to an origin; rather, it expresses a relationship of more intrinsic value 
between the vectors: their relative orientation. Vector equations and covariance 
are natural partners. 
Vector equations are form-invariant under changes of coordinates, but the 
details of their components do change. For instance, in the above equation, 
if one fixes a coordinate system, then the components of the two vectors take on 
definite values. If one then rotates or translates the coordinates, the values of the 
components change, but the equation itself remains true. 
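
A minimal numerical sketch of this statement follows. It is an illustration only: the two vectors, the rotation angle and the helper names are assumptions made for the example. The components of two orthogonal vectors change when the axes are rotated, while the relation A · B = 0 survives.

```python
import math

def rotate(v, angle):
    """Components of the 2D vector v after the coordinate axes are rotated by `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] + s * v[1], -s * v[0] + c * v[1])

def dot(a, b):
    """Euclidean scalar product of two 2D vectors in a Cartesian basis."""
    return a[0] * b[0] + a[1] * b[1]

A = (3.0, 4.0)
B = (-4.0, 3.0)   # chosen by hand so that A . B = 0

# Same vectors, described in axes rotated by 0.7 radians.
A_rot = rotate(A, 0.7)
B_rot = rotate(B, 0.7)
```

The individual numbers in `A_rot` and `B_rot` differ from those in `A` and `B`, yet `dot(A_rot, B_rot)` vanishes to rounding accuracy, just as `dot(A, B)` does.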


3.2.3 Coordinate bases 


A coordinate basis is a set of (n + 1) linearly independent reference vectors 
$e_\mu$, used to provide a concise description of any vector within a vector space. 
They are ‘standard arrows’; without them, every direction would need to have a 
different name.¹ 

In index notation, the components of a vector $a$ are written, relative to a basis 
or set of axes $e_\mu$, as $\{a^\mu\}$, i.e. 


$$ a = \sum_\mu a^\mu e_\mu = a^\mu e_\mu. \tag{3.6} $$


Note that, as usual, there is an implied summation convention over repeated 
indices throughout this book. The subscript $\mu$ runs over the number of 
dimensions of the space. 

Linearity is a central concept in vector descriptions. One does not require 
what happens within the space to be linear, but the basis vectors must be locally 
linear in order for the vector description to be single-valued. Consider, then, 
the set of all linear scalar functions of vectors. Linearity implies that a linear 
combination of arguments leads to a linear combination of the functions: 


$$ \omega(c^\mu e_\mu) = c^\mu\, \omega(e_\mu). \tag{3.7} $$


Also, the linear combination of different functions results in new linear functions: 


$$ \omega'(v) = \sum_\mu c_\mu\, \omega^\mu(v). \tag{3.8} $$


The space of these functions is therefore also a vector space V*, called the dual 
space. It has the same dimension as the vector space (also called the tangent 
space). The duality refers to the fact that one may consider the 1-forms to be 
linear functions of the basis vectors, or vice versa, i.e. 


$$ \omega(v) = v(\omega). \tag{3.9} $$


¹ In terms of information theory, the vector basis provides a systematic (n + 1)-tuple of numbers, 
which in turn provides an optimally compressed coding of directional information in the vector 
space. Without such a system, we would be stuck with names like north, south, east, west, 
north-north-west, north-north-north-west etc. for each new direction. 
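Vector components $v^\mu$ are written 

$$ v = v^\mu e_\mu \tag{3.10} $$

and dual vector (1-form) components are written 

$$ v = v_\mu\, \omega^\mu. \tag{3.11} $$

The scalar product is 

$$ v \cdot v = v^{*} v = (v_\mu \omega^\mu)(v^\nu e_\nu) = v_\mu v^\nu\, (\omega^\mu e_\nu) = v_\mu v^\nu\, \delta^\mu_{\ \nu} = v_\mu v^\mu, \tag{3.12} $$

where 

$$ (\omega^\mu e_\nu) = \delta^\mu_{\ \nu}. \tag{3.13} $$

The metric tensor $g_{\mu\nu}$ maps between these equivalent descriptions: 

$$ v_\mu = g_{\mu\nu}\, v^\nu, \qquad v^\mu = g^{\mu\nu}\, v_\nu, \tag{3.14} $$

and 

$$ e_\mu \cdot e_\nu = g_{\mu\nu} \tag{3.15a} $$
$$ \omega^\mu \cdot \omega^\nu = g^{\mu\nu}. \tag{3.15b} $$


When acting on scalar functions, the basis vectors $e_\mu \to \partial_\mu$ are tangential to the 
vector space; the 1-forms $\omega^\mu \to dx^\mu$ lie along it. 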

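In general, under an infinitesimal shift of the coordinate basis by an amount 
$dx^\mu$, the basis changes by an amount 


$$ de_\mu = \Gamma_{\mu\nu}^{\ \ \lambda}\, e_\lambda\, dx^\nu. \tag{3.16} $$


The symbol $\Gamma_{\mu\nu}^{\ \ \lambda}$ is called the affine connection, or Christoffel symbol. From 
this, one determines that 


$$ \partial_\mu e_\nu = \Gamma_{\mu\nu}^{\ \ \lambda}\, e_\lambda, \tag{3.17} $$

and by differentiating eqn. (3.13), one finds that 

$$ \partial_\mu \omega^\lambda = -\Gamma_{\mu\nu}^{\ \ \lambda}\, \omega^\nu. \tag{3.18} $$


The connection can be expressed in terms of the metric, by differentiating 
eqn. (3.15a): 


$$ \partial_\lambda g_{\mu\nu} = \partial_\lambda e_\mu \cdot e_\nu + e_\mu \cdot \partial_\lambda e_\nu = \Gamma_{\lambda\mu}^{\ \ \rho}\, g_{\rho\nu} + \Gamma_{\lambda\nu}^{\ \ \rho}\, g_{\mu\rho}. \tag{3.19} $$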
By permuting indices in this equation, one may show that 


$$ \Gamma_{\lambda\mu}^{\ \ \sigma} = \frac{1}{2}\, g^{\nu\sigma} \left\{ \partial_\lambda g_{\mu\nu} + \partial_\mu g_{\lambda\nu} - \partial_\nu g_{\mu\lambda} \right\}. \tag{3.20} $$


The connection is thus related to cases where the metric tensor is not constant. 
This occurs in various contexts, such as when using curvilinear coordinates, and 
when fields undergo conformal transformations, such as in the case of gauge 
transformations. 
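
As a rough check of eqn. (3.20), one can evaluate the connection numerically for a simple non-constant metric. The sketch below is an illustration under stated assumptions: it uses the plane polar metric diag(1, r²), approximates the derivatives of the metric by central differences, and all function names are hypothetical. The familiar polar-coordinate results (a connection component equal to −r and two equal to 1/r) should emerge.

```python
def metric(x):
    """Metric g_{mu nu} for plane polar coordinates x = (r, theta) (assumed example)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_metric(x):
    """Inverse metric g^{mu nu} for the same coordinates."""
    r = x[0]
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

def d_metric(x, lam, h=1e-5):
    """Central-difference estimate of partial_lam g_{mu nu} at the point x."""
    xp, xm = list(x), list(x)
    xp[lam] += h
    xm[lam] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[m][n] - gm[m][n]) / (2 * h) for n in range(2)] for m in range(2)]

def christoffel(x):
    """Return gamma[sigma][lam][mu], the connection with upper index sigma,
    built from the metric derivatives as in eqn (3.20)."""
    dim = 2
    ginv = inverse_metric(x)
    dg = [d_metric(x, lam) for lam in range(dim)]  # dg[lam][mu][nu] = d_lam g_{mu nu}
    gamma = [[[0.0] * dim for _ in range(dim)] for _ in range(dim)]
    for sigma in range(dim):
        for lam in range(dim):
            for mu in range(dim):
                gamma[sigma][lam][mu] = 0.5 * sum(
                    ginv[nu][sigma]
                    * (dg[lam][mu][nu] + dg[mu][lam][nu] - dg[nu][mu][lam])
                    for nu in range(dim)
                )
    return gamma
```

With index 0 standing for r and index 1 for θ, `christoffel([2.0, 0.3])` should give approximately −2 for `gamma[0][1][1]` and 0.5 for `gamma[1][0][1]`, while all components vanish for a constant metric.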


3.2.4 Example: Euclidean space 


In n-dimensional Euclidean space, the spatial indices $i$ of a vector’s components 
run from 1 to $n$ except where otherwise stated. The length of a vector interval 
$ds$ is an invariant quantity, which is defined by the inner product. This may be 
written 


$$ ds \cdot ds = dx^2 + dy^2 + dz^2 \tag{3.21} $$

in a Cartesian basis. In the index notation (for $n = 3$) this may be written, 

$$ ds \cdot ds = dx^i\, dx_i. \tag{3.22} $$


Repeated indices are summed over, unless otherwise stated. We distinguish, in 
general, between vector components with raised indices (called contravariant 
components), those with lower indices (called, confusingly, covariant 
components²), and ‘normal’ components, which we shall almost never use. In a 
Cartesian basis $(x, y, z, \ldots)$ there is no difference between these components. In 
other coordinate systems, such as polar coordinates, however, they are different. 

Results which are independent of coordinate basis always involve a sum over 
one raised index and one lower index. The length of the vector interval above 
is an example. We can convert an up index into a down index using a matrix 
(actually a tensor) called the metric tensor $g_{ij}$, 


$$ a_i = g_{ij}\, a^j. \tag{3.23} $$


The inverse of the metric $g_{ij}$ is written $g^{ij}$ (with indices raised), and it serves to 
convert a lower index into an upper one: 


$$ a^i = g^{ij}\, a_j. \tag{3.24} $$

The metric and its inverse satisfy the relation, 


$$ g_{ij}\, g^{jk} = g_i^{\ k} = \delta_i^{\ k}. \tag{3.25} $$


² There is no connection between this designation and the usual meaning of covariant. 
In Cartesian components, the components of the metric are trivial. It is simply 
the identity matrix, or Kronecker delta: 


$$ \text{(Cartesian)}: \quad g_{ij} = g^{ij} = \delta_{ij}. \tag{3.26} $$


To illustrate the difference between covariant, contravariant and normal 
components, consider two-dimensional polar coordinates as an example. The 
vector interval, or line element, is now written 


$$ ds \cdot ds = dr^2 + r^2 d\theta^2. \tag{3.27} $$


The normal components of the vector $ds$ have the dimensions of length in this 
case, and are written 


$$ (dr, r\,d\theta). \tag{3.28} $$

The contravariant components are simply the coordinate intervals, 

$$ ds^i = (dr, d\theta), \tag{3.29} $$

and the covariant components are 

$$ ds_i = (dr, r^2 d\theta). \tag{3.30} $$


The metric tensor is then defined by 


$$ g_{ij} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \tag{3.31} $$

and the inverse tensor is simply 

$$ g^{ij} = \begin{pmatrix} 1 & 0 \\ 0 & r^{-2} \end{pmatrix}. \tag{3.32} $$


The covariant and contravariant components are used almost exclusively in the 
theory of special relativity. 

Having introduced the metric tensor, we may define the scalar product of any 
two vectors $a$ and $b$ by 


$$ a \cdot b = a^i b_i = a^i g_{ij} b^j. \tag{3.33} $$
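
The distinction between the component types can be illustrated with a short sketch. It is an example under assumptions: the chosen point, the size of the displacement and the helper names are all hypothetical. A small displacement is given contravariant polar components, lowered with the polar metric, and its squared length is compared with the Cartesian value for the same displacement.

```python
import math

def polar_metric(r):
    """g_{ij} for plane polar coordinates (r, theta), as in eqn (3.31)."""
    return [[1.0, 0.0], [0.0, r * r]]

def lower(g, a_up):
    """Covariant components a_i = g_{ij} a^j."""
    n = len(a_up)
    return [sum(g[i][j] * a_up[j] for j in range(n)) for i in range(n)]

def scalar_product(g, a_up, b_up):
    """a . b = a^i g_{ij} b^j, as in eqn (3.33)."""
    b_down = lower(g, b_up)
    return sum(a_up[i] * b_down[i] for i in range(len(a_up)))

# A small displacement from the point (r, theta) = (2, pi/6): assumed values.
r, theta = 2.0, math.pi / 6
dr, dtheta = 1e-3, 2e-3

ds_up = [dr, dtheta]                      # contravariant components, eqn (3.29)
ds_down = lower(polar_metric(r), ds_up)   # covariant components, eqn (3.30)
length_sq_polar = scalar_product(polar_metric(r), ds_up, ds_up)

# The same displacement in Cartesian components, to first order in dr and dtheta.
dx = dr * math.cos(theta) - r * dtheta * math.sin(theta)
dy = dr * math.sin(theta) + r * dtheta * math.cos(theta)
length_sq_cartesian = dx * dx + dy * dy
```

The two squared lengths agree, although neither `ds_up` nor `ds_down` coincides with the Cartesian components; only the contraction of one raised and one lowered index is basis-independent.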


The definition of the vector product and the curl are special to three space 
dimensions. We define the completely anti-symmetric tensor in three dimensions 
by 


$$ \epsilon^{ijk} = \begin{cases} +1 & ijk = 123 \text{ and even permutations} \\ -1 & ijk = 321 \text{ and other odd permutations} \\ 0 & \text{otherwise.} \end{cases} \tag{3.34} $$

This is also referred to as the three-dimensional Levi-Civita tensor in some 
texts. Since its value depends on permutations of 123, and its indices run only 
over these values, it can only be used to generate products in three dimensions. 
There are generalizations of this quantity for other numbers of dimensions, but 
the generalizations must always have the same number of indices as spatial 
dimensions, thus this object is unique in three dimensions. More properties 
of anti-symmetric tensors are described below. 

In terms of this tensor, we may write the $i$th covariant component of the 
three-dimensional vector cross-product as 


$$ (b \times c)_i = \epsilon_{ijk}\, b^j c^k. \tag{3.35} $$


Contracting with a scalar product gives the volume of a parallelepiped spanned 
by vectors $a$, $b$ and $c$, 


$$ a \cdot (b \times c) = \epsilon_{ijk}\, a^i b^j c^k, \tag{3.36} $$


which is basis-independent. 
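
The index expressions for the cross-product and the triple product translate directly into code. The following is a minimal sketch in a Cartesian basis, where raised and lowered components coincide; the function names are assumptions, and the indices run over 0, 1, 2 rather than 1, 2, 3 purely for programming convenience.

```python
def levi_civita(i, j, k):
    """Anti-symmetric symbol for indices in {0, 1, 2}:
    +1 for even permutations of (0, 1, 2), -1 for odd ones, 0 if any index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(b, c):
    """(b x c)_i = epsilon_{ijk} b^j c^k in a Cartesian basis."""
    return [
        sum(levi_civita(i, j, k) * b[j] * c[k] for j in range(3) for k in range(3))
        for i in range(3)
    ]

def triple(a, b, c):
    """a . (b x c) = epsilon_{ijk} a^i b^j c^k, the parallelepiped volume."""
    return sum(a[i] * component for i, component in enumerate(cross(b, c)))
```

For example, `triple([2, 0, 0], [0, 3, 0], [0, 0, 4])` returns the volume of a 2 × 3 × 4 box, and exchanging any two arguments flips the sign, as the anti-symmetry of the tensor requires.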


3.2.5 Example: Minkowski spacetime 


The generalization of Euclidean space to relativistically motivated spacetime 
is called Minkowski spacetime. Close to the speed of light, the lengths of n- 
dimensional spatial vectors are not invariant under boosts (changes of speed), 
due to the Lorentz length contraction. From classical electromagnetism, one 
finds that the speed of light in a vacuum must be constant for all observers: 

$$ c^2 = \frac{1}{\epsilon_0 \mu_0}, \tag{3.37} $$


and one deduces from this that a new quantity is invariant; we refer to this as the 
invariant line element 


$$ ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2 = -c^2 d\tau^2, \tag{3.38} $$


where $d\tau$ is referred to as the proper time. By comparing the middle and 
rightmost terms in this equation, it may be seen that the proper time is the 
time coordinate in the rest frame of a system, since there is no change in the 
position variables. The negative sign singles out the time contribution as special. 
The nomenclature ‘timelike separation’ is used for intervals in which $ds^2 < 0$, 
‘spacelike separation’ is used for $ds^2 > 0$, and ‘null’ is used for $ds^2 = 0$. 
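
The invariance of the line element and the classification of intervals can be checked with a few lines of code. This is an illustrative sketch only: the function names are assumptions, units with c = 1 are used in the example for convenience, and the boost along x is the standard Lorentz transformation rather than anything derived in this section.

```python
import math

def interval_squared(dt, dx, dy, dz, c=1.0):
    """ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2, as in eqn (3.38)."""
    return -(c * dt) ** 2 + dx ** 2 + dy ** 2 + dz ** 2

def classify(ds2):
    """Name the separation according to the sign of ds^2."""
    if ds2 < 0:
        return "timelike"
    if ds2 > 0:
        return "spacelike"
    return "null"

def boost_x(dt, dx, v, c=1.0):
    """Standard Lorentz boost with speed v along x; returns (dt', dx')."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (dt - v * dx / c ** 2), gamma * (dx - v * dt)
```

Boosting the interval (dt, dx) = (2, 1) with v = 0.6 changes both components, but `interval_squared` returns −3 before and after, so the separation remains timelike for every observer.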

In terms of (n + 1) dimensional vectors, one writes: 


$$ ds^2 = dx^\mu dx_\mu = dx^\mu g_{\mu\nu}\, dx^\nu, \tag{3.39} $$

where $\mu, \nu = 0, 1, 2, \ldots, n$. In a Cartesian basis, the contravariant and covariant 
components of the spacetime interval are defined, respectively, by 


$$ dx^\mu = (ct, x, y, z, \ldots) $$
$$ dx_\mu = (-ct, x, y, z, \ldots), \tag{3.40} $$


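and the metric tensor in this Cartesian basis, or locally inertial frame (LIF), is 
the constant tensor 


$$ \eta_{\mu\nu} \equiv g_{\mu\nu}\Big|_{\rm LIF} = \begin{pmatrix} -1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}. \tag{3.41} $$


This is a special case of a metric in a general frame $g_{\mu\nu}$. 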

This placement of signs in the metric is arbitrary, and two other conventions 
are found in the literature: the opposite sign for the metric, with corresponding 
movement of the minus sign from the time to the space parts in the covariant 
and contravariant components; and a Euclidean formulation, in which the 
metric is entirely positive (positive definite), and the time components of 
vectors are symmetrically $ict$. This last form, called a Euclidean 
formulation (or Riemannian in curved spacetime), has several uses, and thus 
we adopt conventions in this text in which it is trivial to convert to the Euclidean 
form and back. 

Contravariant vectors describe regular parametrizations of the coordinates. In 
order to define a frame-invariant derivative, we need to define partial derivatives 
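by the requirement that the partial derivative of $x^1$ with respect to $x^1$ be unity: 


$$ \frac{\partial}{\partial x^1}\, x^1 = \partial_1 x^1 = 1. \tag{3.42} $$


Notice that ‘dividing by’ an upper index makes it into an object with an 
effectively lower index. More generally, we require: 


$$ \frac{\partial}{\partial x^\mu}\, x^\nu = \partial_\mu x^\nu = \delta_\mu^{\ \nu}. \tag{3.43} $$


From this, one sees that the Cartesian components of the derivative must be 


$$ \partial_\mu = \left( \frac{1}{c}\partial_t,\ \partial_x,\ \partial_y,\ \partial_z,\ \ldots \right) $$
$$ \partial^\mu = \left( -\frac{1}{c}\partial_t,\ \partial_x,\ \partial_y,\ \partial_z,\ \ldots \right). \tag{3.44} $$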
Ç 
Velocity is a relative concept, by definition. It is intimately associated with a 
choice of Lorentz frame. The relative velocity is defined as the time derivative 
of the position 


$$ \beta^\mu = \frac{1}{c} \frac{dx^\mu}{dt}. \tag{3.45} $$


Unfortunately, because both $x^\mu$ and $t$ are frame-dependent, this quantity does 
not transform like a vector. To obtain a vector, we choose to look at 


$$ U^\mu = \frac{dx^\mu}{d\tau}. \tag{3.46} $$


The components of the relative velocity are as follows: 

$$ \beta^\mu = (\beta^0, \beta^i) = (1, v^i/c). \tag{3.47} $$

The relationship to the velocity vector is given by 

$$ U^\mu = \gamma c\, \beta^\mu. \tag{3.48} $$

Hence, 


$$ U^\mu U_\mu = -c^2. \tag{3.49} $$
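
The normalization of the velocity vector is easy to confirm numerically. The sketch below is an illustration under assumptions: the function names are hypothetical, γ is taken to be the usual Lorentz factor, and the metric is the Cartesian one with a negative time component used in this chapter.

```python
import math

C = 299792458.0  # speed of light in m/s

def four_velocity(v, c=C):
    """U^mu = gamma * c * beta^mu with beta^mu = (1, v^i / c), eqns (3.47)-(3.48).
    `v` is the list of spatial velocity components."""
    speed_sq = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq / (c * c))
    return [gamma * c] + [gamma * vi for vi in v]

def minkowski_square(u):
    """U^mu U_mu with the metric diag(-1, +1, ..., +1)."""
    return -u[0] ** 2 + sum(x * x for x in u[1:])
```

For any subluminal velocity, `minkowski_square(four_velocity(v))` returns −c² up to rounding, independently of the direction or magnitude of `v`, in agreement with eqn. (3.49).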


3.3 Momentum space and waves 


The reciprocal wavevector space of $k_\mu$ plays a complementary role to that of 
spacetime. It measures changes in waves when one is not interested in spacetime 
locations. Pure harmonic (sinusoidal) waves are spread over an infinite distance. 
They have no beginning or end, only a definite wavelength. 

In the quantum theory, energy and momentum are determined by the operators 


$$ E \to i\hbar\partial_t, \qquad p_i \to -i\hbar\partial_i, \tag{3.50} $$

which have pure values when acting on plane wave states 

$$ \psi \sim \exp i(k_i x^i - \omega t). \tag{3.51} $$

In (n + 1) dimensional notation, the wavevector becomes: 

$$ k_\mu = \left( -\frac{\omega}{c},\ k_i \right), \tag{3.52} $$

so that plane waves take the simple form 


$$ \psi \sim \exp(i k_\mu x^\mu). \tag{3.53} $$
The energy and momentum are therefore given by the time and space eigenvalues 
of the operator 


$$ p_\mu = -i\hbar\partial_\mu, \tag{3.54} $$


respectively, as they act upon a plane wave. This leads to the definition of an 
(n + 1) dimensional energy-momentum vector, 


$$ p_\mu = \hbar k_\mu = \left( -\frac{E}{c},\ p_i \right). \tag{3.55} $$


The identification $p_\mu = \hbar k_\mu$ is the de Broglie relation for matter waves. This is 
one of the most central and important relations in the definition of the quantum 
theory of matter. 

In discussing wavelike excitations, it is useful to resolve the components of 
vectors along the direction of motion of the wave (longitudinal) and perpen- 
dicular (transverse) to the direction of motion. A longitudinal vector is one 
proportional to a vector in the direction of motion of a wave $k^\mu$. A transverse 
vector is orthogonal to this vector. The longitudinal and transverse components 
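of a vector are defined by 


$$ V_L^\mu = \frac{k^\mu k_\nu}{k^2}\, V^\nu $$
$$ V_T^\mu = \left( g^\mu_{\ \nu} - \frac{k^\mu k_\nu}{k^2} \right) V^\nu. \tag{3.56} $$

It is straightforward to verify that the two projection operators 

$$ P_L{}^\mu_{\ \nu} = \frac{k^\mu k_\nu}{k^2} $$
$$ P_T{}^\mu_{\ \nu} = g^\mu_{\ \nu} - \frac{k^\mu k_\nu}{k^2} \tag{3.57} $$

are orthogonal to one another: 

$$ (P_L)^\mu_{\ \nu}\, (P_T)^\nu_{\ \lambda} = 0. \tag{3.58} $$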
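
The orthogonality of the two projectors can be verified numerically for a sample wavevector. The sketch below is illustrative only: it assumes (3 + 1) dimensions with the Cartesian metric diag(−1, 1, 1, 1), treats the projectors as matrices with one raised and one lowered index, and uses hypothetical function names.

```python
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

def projectors(k_up):
    """Return (P_L, P_T) as mixed-index matrices P[mu][nu] for a non-null k^mu."""
    dim = len(k_up)
    k_down = [sum(ETA[m][n] * k_up[n] for n in range(dim)) for m in range(dim)]
    k_sq = sum(k_up[m] * k_down[m] for m in range(dim))
    if k_sq == 0:
        raise ValueError("projectors are undefined for a null wavevector")
    p_l = [[k_up[m] * k_down[n] / k_sq for n in range(dim)] for m in range(dim)]
    p_t = [[(1.0 if m == n else 0.0) - p_l[m][n] for n in range(dim)] for m in range(dim)]
    return p_l, p_t

def matmul(a, b):
    """Matrix product, i.e. contraction of the inner index pair."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(p, v_up):
    """Act with a mixed-index matrix on contravariant components."""
    return [sum(p[m][n] * v_up[n] for n in range(len(v_up))) for m in range(len(v_up))]
```

For a wavevector such as `[1.0, 2.0, 0.5, -1.0]`, every entry of `matmul(P_L, P_T)` vanishes to rounding accuracy, `P_L` returns the wavevector itself when applied to it, and `P_T` annihilates it.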


3.4 Tensor transformations 


Vector equations remain true in general coordinate frames because the com- 
ponents of a vector transform according to specific rules under a coordinate 
transformation $U$: 


$$ v' = U v, \tag{3.59} $$

or 

$$ v'^i = U^i_{\ j}\, v^j, \tag{3.60} $$


where the components of the matrix U are fixed by the requirement that the 
equations remain true in general coordinate systems. This is a valuable property, 
and we should be interested in generalizations of this idea which might be useful 
in physics. 

Tensors are objects with any number of indices, which have the same basic 
transformation properties as vectors. The number of indices on a tensor is its 
rank. Each free index in a tensor equation requires a transformation matrix 
under changes of coordinates; free indices represent the components in a specific 
coordinate basis, and each summed index is invariant since scalar products are 
independent of basis. 

Under a change of coordinates, $x \to x'$, a scalar (rank 0-tensor) transforms 
simply as 


$$ \phi(x) \to \phi(x'). \tag{3.61} $$


For a vector (rank 1-tensor), such a simple rule does not make sense. If one 
rotates a coordinate system, for instance, then all the components of a vector 
must change, since it points in a new direction with respect to the coordinate 
axes. Thus, a vector’s components must transform separately, but as linear 
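combinations of the old components. The rule for a vector with raised index 
is: 


$$ V'^\mu(x') = \frac{\partial x'^\mu}{\partial x^\nu}\, V^\nu(x) = (\partial_\nu x'^\mu)\, V^\nu(x). \tag{3.62} $$

For a vector with lowered index, it is the converse: 

$$ V'_\mu(x') = \frac{\partial x^\nu}{\partial x'^\mu}\, V_\nu(x) = (\partial'_\mu x^\nu)\, V_\nu(x). \tag{3.63} $$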


Here we have used two notations for the derivatives: the longhand notation first 
for clarity and the shorthand form which is more compact and is used throughout 
this book. 

The metric tensor is a tensor of rank 2. Using the property of the metric in 
raising and lowering indices, one can also deduce its transformation rule under 
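the change of coordinates from $x$ to $x'$. Starting with 


$$ V'^\mu(x') = g'^{\mu\nu}(x')\, V'_\nu(x'), \tag{3.64} $$


and expressing it in the $x$ coordinate system, using the transformation above, 
one obtains: 


$$ (\partial_\nu x'^\mu)\, V^\nu(x) = g'^{\mu\nu}(x')\, (\partial'_\nu x^\rho)\, V_\rho(x). \tag{3.65} $$

However, it is also known that, in the unprimed coordinates, 

$$ V^\nu(x) = g^{\nu\rho}(x)\, V_\rho(x). \tag{3.66} $$


Comparing eqns. (3.65) and (3.66), it is possible to deduce the transformation 
rule for the inverse metric $g^{\mu\nu}$. To do this, one rearranges eqn. (3.65) by 
multiplying by $(\partial'_\mu x^\tau)$ and using the chain-rule: 


$$ (\partial'_\mu x^\tau)(\partial_\nu x'^\mu) = \delta_\nu^{\ \tau}. \tag{3.67} $$


Being careful to re-label duplicate indices, this gives 


$$ \delta_\nu^{\ \tau}\, V^\nu(x) = g'^{\mu\lambda}(x')\, (\partial'_\mu x^\tau)(\partial'_\lambda x^\rho)\, V_\rho(x), \tag{3.68} $$

which is 

$$ V^\tau(x) = g'^{\mu\lambda}(x')\, (\partial'_\mu x^\tau)(\partial'_\lambda x^\rho)\, V_\rho(x). \tag{3.69} $$

Comparing this with eqn. (3.66), one finds that 

$$ g'^{\mu\lambda}(x')\, (\partial'_\mu x^\tau)(\partial'_\lambda x^\rho) = g^{\tau\rho}(x), \tag{3.70} $$

or, equivalently, after re-labelling and re-arranging once more, 

$$ g'^{\mu\nu}(x') = (\partial_\rho x'^\mu)(\partial_\sigma x'^\nu)\, g^{\rho\sigma}(x). \tag{3.71} $$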


One sees that this follows the same pattern as the vector transformation with 
raised indices. The difference is that there is now a partial derivative matrix 
$(\partial_\nu x'^\mu)$ for each index. In fact, this is a general feature of tensors. Each raised 
index transforms with a factor like $(\partial_\mu x'^\nu)$ and each lowered index transforms 
with a factor like $(\partial'_\mu x^\nu)$. For instance, 


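$$ T'^{\mu\nu}{}_{\rho\sigma}(x') = (\partial_\alpha x'^\mu)(\partial_\beta x'^\nu)(\partial'_\rho x^\gamma)(\partial'_\sigma x^\delta)\, T^{\alpha\beta}{}_{\gamma\delta}(x). \tag{3.72} $$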
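
The rule that each raised index carries one derivative matrix can be tested on the inverse metric itself. The sketch below is an illustration under assumptions: it takes the unprimed coordinates to be Cartesian (x, y), the primed coordinates to be plane polar (r, θ), and uses hypothetical function names. Transforming the Cartesian identity metric should reproduce the polar inverse metric with diagonal entries 1 and 1/r².

```python
import math

def jacobian_cartesian_to_polar(x, y):
    """Matrix jac[mu][rho] = d x'^mu / d x^rho with x' = (r, theta), x = (x, y)."""
    r_sq = x * x + y * y
    r = math.sqrt(r_sq)
    return [[x / r, y / r], [-y / r_sq, x / r_sq]]

def transform_inverse_metric(jac, g_inv):
    """g'^{mu nu} = (d_rho x'^mu)(d_sigma x'^nu) g^{rho sigma}: one factor per raised index."""
    n = len(jac)
    return [
        [
            sum(jac[mu][rho] * jac[nu][sigma] * g_inv[rho][sigma]
                for rho in range(n) for sigma in range(n))
            for nu in range(n)
        ]
        for mu in range(n)
    ]

# Cartesian inverse metric: the identity matrix.
G_INV_CARTESIAN = [[1.0, 0.0], [0.0, 1.0]]
```

At the point (x, y) = (1.2, 1.6), where r = 2, `transform_inverse_metric(jacobian_cartesian_to_polar(1.2, 1.6), G_INV_CARTESIAN)` should return a diagonal matrix with entries 1 and 0.25.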


3.5 Properties 


The following properties of tensors are instructive and useful. 


(1) Any matrix $T$ may be written as a sum of a symmetric part $\overline{T}_{ij} = \frac{1}{2}(T_{ij} + T_{ji})$ 
and an anti-symmetric part $\tilde{T}_{ij} = \frac{1}{2}(T_{ij} - T_{ji})$. Thus one may write 
any 2 × 2 matrix in the form 


$$ T = \begin{pmatrix} T_{11} & \overline{T}_{12} + \tilde{T}_{12} \\ \overline{T}_{12} - \tilde{T}_{12} & T_{22} \end{pmatrix}. \tag{3.73} $$


(2) It may be shown that the trace of the product of a symmetric matrix with 
an anti-symmetric matrix is zero, i.e. $S^{ij}\, \tilde{T}_{ij} = 0$. 
(3) By considering similarity transformations of the form $T \to A^{-1} T A$, one 
may show that the trace of any matrix is an invariant, equal to the sum of 
its eigenvalues. 


(4) By definition, a rank 2-tensor $T$ transforms by the following matrix 
multiplication rule: 


$$ T \to A^{T}\, T A, \tag{3.74} $$

for some transformation matrix $A$. Consider a general 2 × 2 tensor 


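$$ T = \begin{pmatrix} \frac{1}{2} t + \Delta T_{11} & \tilde{T}_{12} + \overline{T}_{12} \\ \tilde{T}_{21} + \overline{T}_{21} & \frac{1}{2} t + \Delta T_{22} \end{pmatrix}, $$


where $t$ is the trace $t = (T_{11} + T_{22})$, and consider the effect of the 
following matrices on $T$: 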


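$$ A_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad A_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{3.75} $$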


For each of these matrices, compute: 
(a) $A^{-1} T A$, 
(b) $A^{T} T A$. 
It may be shown that, used as a transformation on T: 


(a) the anti-symmetric matrix $A_1$ leaves anti-symmetric terms invariant 
and preserves the trace of $T$; 


(b) the off-diagonal symmetric matrix $A_2$ leaves the off-diagonal 
symmetric terms invariant and preserves the trace of $T$; 


(c) the symmetrical, traceless matrix $A_3$ preserves only the trace of $T$. 


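It may thus be concluded that a tensor $T$ in $n$ dimensions has three 
separately invariant parts and may be written in the form 


$$ T_{ij} = \frac{1}{n}\, \delta_{ij}\, T^k_{\ k} + \tilde{T}_{ij} + \left( \overline{T}_{ij} - \frac{1}{n}\, \delta_{ij}\, T^k_{\ k} \right). \tag{3.76} $$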
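
The splitting of a tensor into a trace part, an anti-symmetric part and a symmetric traceless part can be explored numerically. The sketch below is an illustration under assumptions: the function names are hypothetical, and a two-dimensional rotation is chosen as the transformation matrix, which is one convenient orthogonal example rather than the general case.

```python
import math

def transpose(m):
    """Matrix transpose."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def decompose(t):
    """Split t into (trace part, anti-symmetric part, symmetric traceless part)."""
    n = len(t)
    trace = sum(t[i][i] for i in range(n))
    trace_part = [[trace / n if i == j else 0.0 for j in range(n)] for i in range(n)]
    antisym = [[0.5 * (t[i][j] - t[j][i]) for j in range(n)] for i in range(n)]
    sym_traceless = [
        [0.5 * (t[i][j] + t[j][i]) - trace_part[i][j] for j in range(n)]
        for i in range(n)
    ]
    return trace_part, antisym, sym_traceless

def transform(t, a):
    """Rank-2 tensor rule T -> A^T T A."""
    return matmul(transpose(a), matmul(t, a))

def rotation(angle):
    """Two-dimensional rotation matrix, used here as an example of A."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s], [-s, c]]
```

Decomposing a sample matrix before and after `transform(T, rotation(0.4))` shows that the trace part is unchanged, the anti-symmetric part remains anti-symmetric (and in two dimensions is numerically identical), and the symmetric part stays symmetric and traceless, so the three pieces do not mix.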
3.6 Euclidean and Riemannian spacetime 


Minkowski spacetime has an indefinite metric tensor signature. In Euclidean 
and Riemannian spacetime, the metric signature is definite (usually positive 
definite). General curved spaces with a definite signature are referred to 
as Riemannian manifolds. Multiplying the time components of vectors and 
tensors by the square-root of minus one (i) allows one to pass from Minkowski 
spacetime to Euclidean spacetime and back again. This procedure is known as 
Wick rotation and is encountered in several contexts in quantum theory. For 
instance, it serves a regulatory role: integrals involving the Lorentzian form 
$(k^2 + m^2)^{-1}$ are conveniently evaluated in Euclidean space, where $k^2 + m^2$ 
has no zeros. Also, there is a convenient relationship between equilibrium 
thermodynamics and quantum field theory in Euclidean space. 

We shall use subscripts and superscripts ‘E’ to indicate quantities in Euclidean 
space; ‘M’ denotes Minkowski space, for this section only. Note that the 
transformation affects only the time or zeroth components of tensors; the space 
parts are unchanged. 

The transformation required to convert from Minkowski spacetime (with its 
indefinite metric) to Euclidean spacetime (with its definite metric) is motivated 
by the appearance of plane waves in the Fourier decomposition of field variables. 
Integrals over plane waves of the form $\exp i(\mathbf{k} \cdot \mathbf{x} - \omega t)$ have no definite 
convergence properties, since the complex exponential simply oscillates for 
all values of $\mathbf{k}$ and $\omega$. However, if one adds a small imaginary part to time 
$t \to t - i\tau$, then we turn the oscillatory behaviour into exponential decay: 


$$ e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t)} \to e^{i(\mathbf{k} \cdot \mathbf{x} - \omega t)}\, e^{-\omega\tau}. \tag{3.77} $$


The requirement of decay rather than growth chooses the sign for the Wick 
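rotation. An equivalent motivation is to examine the Lorentzian form: 


$$ \frac{1}{k^2 + m^2} = \frac{1}{-k_0^2 + \mathbf{k}^2 + m^2} = \frac{1}{\left(-k_0 + \sqrt{\mathbf{k}^2 + m^2}\right)\left(k_0 + \sqrt{\mathbf{k}^2 + m^2}\right)}. \tag{3.78} $$


This is singular and has poles on the real $k_0$ axis at $k_0 = \pm\sqrt{\mathbf{k}^2 + m^2}$. This 
makes the integral over $k_0$ non-analytical, and a prescription must be specified for 
integrating around the poles. The problem can be resolved by adding a small 
(infinitesimal) imaginary part to the momenta: 


$$ \frac{1}{k^2 + m^2 - i\epsilon} = \frac{1}{\left(-k_0 - i\epsilon + \sqrt{\mathbf{k}^2 + m^2}\right)\left(k_0 - i\epsilon + \sqrt{\mathbf{k}^2 + m^2}\right)}. \tag{3.79} $$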


This effectively shifts the poles from the real axis to above the axis for negative 
$k_0$ and below the axis for positive $k_0$. Since it is possible to rotate the contour 
90 degrees onto the imaginary axis without having to pass through any poles, by 
defining (see section 6.1.1) 


$$ k_0^E = i k_0, \tag{3.80} $$


this once again chooses the sign of the rotation. The contour is rotated clockwise 
by 90 degrees, the integrand is positive definite and no poles are encountered in 
an integral over $k_0^E$: 


$$ \frac{1}{-k_0^2 + \mathbf{k}^2 + m^2 - i\epsilon} \to \frac{1}{k_{0E}^2 + \mathbf{k}^2 + m^2}. \tag{3.81} $$
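
A short numerical sketch illustrates why the rotated integrand is harmless. It is an illustration only, with assumed sample values for the spatial momentum and mass and hypothetical function names: along the real Minkowski axis the denominator vanishes at the poles, whereas along the real Euclidean axis it is bounded below by k² + m².

```python
import math

def minkowski_denominator(k0, k_vec_sq, m):
    """-k0^2 + k^2 + m^2 for a real Minkowski energy component k0."""
    return -k0 * k0 + k_vec_sq + m * m

def wick_rotate(k0):
    """Euclidean component defined by k0_E = i * k0."""
    return 1j * k0

def euclidean_denominator(k0_e, k_vec_sq, m):
    """k0_E^2 + k^2 + m^2; positive for every real k0_E when k^2 + m^2 > 0."""
    return k0_e * k0_e + k_vec_sq + m * m

K_VEC_SQ, MASS = 3.0, 1.0                        # assumed sample values
POLE = math.sqrt(K_VEC_SQ + MASS * MASS)         # pole position on the real k0 axis

# Scan real values of the Euclidean variable from -10 to 10.
euclidean_minimum = min(
    euclidean_denominator(-10.0 + 0.01 * n, K_VEC_SQ, MASS) for n in range(2001)
)
```

Here `minkowski_denominator(POLE, K_VEC_SQ, MASS)` is zero, while `euclidean_minimum` never drops below k² + m² = 4. Substituting `wick_rotate(k0)` into the Euclidean form reproduces the Minkowski expression, which is the content of the replacement rule for the energy component.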
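All variables in a field theory must be rotated consistently: 

$$ x^0_E = -i x^0 \tag{3.82} $$
$$ x_0^E = i x_0 \tag{3.83} $$
$$ k_0^E = i k_0 = -i\omega/c. \tag{3.84} $$

The inner product 

$$ k_\mu x^\mu = \mathbf{k} \cdot \mathbf{x} + k_0 x^0 \to \mathbf{k} \cdot \mathbf{x} + k_0^E x^0_E \tag{3.85} $$

is consistent with 

$$ \partial_0 x^0 = \partial_0^E x^0_E = 1, \tag{3.86} $$

where 

$$ \partial_0^E = i \partial_0, \tag{3.87} $$


since $\partial_0^E \to i k_0^E$. Since the Wick transformation affects derivatives and vectors, 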
it also affects Maxwell’s equations. From 


$$ \partial^\mu F_{\mu\nu} = \mu_0 J_\nu, \tag{3.88} $$

we deduce that 

$$ J_0^E = i J_0 \tag{3.89} $$
$$ A_0^E = i A_0, \tag{3.90} $$


which are necessary in view of the homogeneous form of the field strength: 

$$ -i F^E_{0i} = \partial_0 A_i - \partial_i A_0 = F_{0i}. \tag{3.91} $$


Notice that, in (3 + 1) dimensions, this means that 


$$ \frac{1}{2} F^{\mu\nu} F_{\mu\nu} = \left( B^2 - \frac{E^2}{c^2} \right) = \left( B^2 + \frac{E_E^2}{c^2} \right). \tag{3.92} $$
Notice how the Euclideanized Lagrangian takes on the appearance of a Hamiltonian. 
This result is the key to relating Wick-rotated field theory to thermodynamical 
partition functions. It works because the quantum phase factor $\exp(iS_M/\hbar)$ 
looks like the partition function, or statistical weight factor $\exp(-\beta H_M)$, when 
Wick-rotated: 


$$ S_E = -i S_M, \tag{3.93} $$


since the volume measure $dV_E = -i\, dV_x$. The superficial form of the 
Lagrangian density is unchanged in theories with only quadratic derivatives 
provided everything is written in terms of summed indices, but internally all 
of the time-summed terms have changed sign. Thus, one has that 


$$ \exp\left( i \frac{S_M}{\hbar} \right) = \exp\left( -\frac{S_E}{\hbar} \right) \sim \exp\left( -\frac{1}{\hbar} \int dV_E\, \mathcal{H}_M \right). \tag{3.94} $$


A Euclideanized invariant becomes something which looks like a Minkowski 
space non-invariant. The invariant $F^2$, which is used to deduce the dynamics of 
electromagnetism, transformed into Euclidean space, resembles a non-invariant 
of Minkowski space called the Hamiltonian, or total energy function (see 
eqn. (2.70)). This has physical as well as practical implications for field theories 
at finite temperature. If one takes the Euclidean time integral to run from zero 
to $\hbar\beta$ and takes $H = \int d\sigma\, \mathcal{H}$, 


$$ \exp\left( i \frac{S_M}{\hbar} \right) = \exp\left( -\beta H_M \right), \tag{3.95} $$


then a Euclidean field theory phase factor resembles a Minkowski space, 
finite-temperature Boltzmann factor. This is discussed further in chapter 6. 
In a Cartesian basis, one has 


$$ g_{\mu\nu} \to g^E_{\mu\nu} = \delta_{\mu\nu}. \tag{3.96} $$


4 


The action principle 


The variational principle is central to covariant field theory. It displays 
symmetries, field equations and continuity conditions on an equal footing. It 
can be used as the starting point for every field theoretical analysis. In older 
books, the method is referred to as Hamilton’s principle. In field theory it is 
referred to more colloquially as the action principle. Put plainly, it is a method 
of generating functionals; it compresses all of the kinematics and dynamics of a 
physical theory into a single integral expression S called the action. 

The advantage of the action principle is that it guarantees a well formulated 
dynamical problem, assuming only the existence of a set of parameters on 
which the dynamical variables depend. Any theory formulated as, and derived 
from an action principle, automatically leads to a complete dynamical system 
of equations with dynamical variables which play the roles of positions and 
momenta, by analogy with Newtonian mechanics. To formulate a new model 
in physics, all one does is formulate invariant physical properties in the form of 
an action, and the principle elucidates the resulting kinematical and dynamical 
structure in detail. 


4.1 The action in Newtonian particle mechanics 


Consider a system consisting of a particle with position $q(t)$ and momentum
$p(t)$. The kinetic energy of the particle is


$$ T = \frac{1}{2} m \dot{q}^2, \tag{4.1} $$


and the potential energy is simply denoted $V(q)$. The ‘dot’ over the $q$ denotes
the time derivative, or


$$ \dot{q} = \frac{dq}{dt}. \tag{4.2} $$
Classical mechanics holds that the equation of motion for a classical particle is
Newton’s law:


$$ F = m\ddot{q} = -\frac{dV}{dq}, \tag{4.3} $$


but it is interesting to be able to derive this equation from a general principle. 
If many equations of motion could be derived from a common principle, it 
would represent a significant compression of information in physics. This is 
accomplished by introducing a generating function L called the Lagrangian. 
For a conservative system, the Lagrangian is defined by 


L=T-V, (4.4) 


which, in this case, becomes 


$$ L = \frac{1}{2} m \dot{q}^2 - V(q). \tag{4.5} $$

This form, kinetic energy minus potential energy, is a coincidence. It does not
apply to all Lagrangians. In relativistic theories, for instance, it is not even clear
what one should refer to as the kinetic and potential energies. The Lagrangian
is a generating function; it has no unique physical interpretation.

The Lagrangian is formally a function of $q$ and $\dot{q}$. The general rule for
obtaining the equations of motion is the well known Euler-Lagrange equations.
They are


$$ \frac{\partial L}{\partial q} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) = 0. \tag{4.6} $$


If the physical system is changed, one only has to change the Lagrangian: the 
general rule will remain true. Evaluating, in this case, 


$$ \frac{\partial L}{\partial q} = -\frac{dV}{dq}, \qquad \frac{\partial L}{\partial \dot{q}} = m\dot{q}, \tag{4.7} $$


one obtains the field equations (4.3), as promised. 
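
As a quick check of this recipe, one may take a harmonic oscillator. The
quadratic potential and the spring constant $k$ are assumptions introduced here
purely for illustration; they do not appear in the text above.

```latex
\begin{align*}
V(q) &= \tfrac{1}{2} k q^2, \qquad
L = \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2, \\
\frac{\partial L}{\partial q} &= -k q, \qquad
\frac{\partial L}{\partial \dot{q}} = m \dot{q}, \\
\frac{\partial L}{\partial q} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right)
 &= -k q - m \ddot{q} = 0
 \quad\Longrightarrow\quad m\ddot{q} = -k q = -\frac{dV}{dq}.
\end{align*}
```

The last line is Newton's law for a linear restoring force, in agreement with
the general form $m\ddot{q} = -dV/dq$.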

Is this approach better than a method in which one simply writes down the 
field equations? Rather than changing the field equations for each case, one 
instead changes the Lagrangian. Moreover, eqn. (4.6) was pulled out of a hat, 
so really there are two unknowns now instead of one! To see why this approach 
has more to offer, we introduce the action. 
4.1.1 Variational principle


The fact that one can derive known equations of motion from an arbitrary 
formula involving a constructed function L is not at all surprising — there are 
hundreds of possibilities; indeed, the motivation for such an arbitrary procedure 
is not clear. The fact that one can obtain them from a function involving 
only the potential and kinetic energies of the system, for any conservative 
system, is interesting. What is remarkable is the fact that one can derive the 
Euler-Lagrange equations (i.e. the equations of motion), together with many 
other important physical properties for any system, from one simple principle: 
the action principle. 
Consider the action S, constructed from the Lagrangian by


$$ S = \int_{t_1}^{t_2} L(q, \dot{q})\, dt. \tag{4.8} $$


The action has (naturally) dimensions of action or ‘energy × time’, and is
thought of as being a property of the path $q(t)$ of our particle between the
fixed points $q(t_1)$ and $q(t_2)$. The action has no physical significance in itself.
Its significance lies instead in the fact that it is a generating functional for the
dynamical properties of a physical system.

When formulating physics using the action, it is not necessary to consider the
fact that $q$ and $\dot{q}$ are independent variables: that is taken care of automatically. In
fact, the beauty of the action principle is that all of the useful information about
a physical system falls out of the action principle more or less automatically.

To extract information from S, one varies it with respect to its dynamical
variables, i.e. one examines how the integral changes when the key variables in
the problem are changed. The details one can change are $t_1$ and $t_2$, the end-points
of integration, and $q(t)$, the path or world-line of the particle between those two
points (see figure 4.1). Note however that $q(t)$ is the path the particle would
take from A to B, and that is not arbitrary: it is determined by, or determines,
physical law, depending on one’s view. So, in order to make the variational
principle a useful device, we have to be able to select the correct path by some
simple criterion.

Remarkably, the criterion is the same in every case: one chooses the path
which minimizes (or more correctly: makes stationary) the action; i.e. we look
for paths $q(t)$ satisfying


$$ \frac{\delta S}{\delta q(t)} = 0. \tag{4.9} $$


These are the stable or stationary solutions to the variational problem. This tells
us that most physical laws can be thought of as regions of stability in a space
of all solutions. The action behaves like a potential, or stability measure, in this
space.
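
The stationarity criterion can be illustrated numerically. The following is a
minimal sketch, assuming a harmonic oscillator with $m = k = 1$ (an example chosen
here, not one taken from the text) and a simple midpoint discretization of the
action. The function names are hypothetical. The classical path $q(t) = \sin t$ is
perturbed by a bump that vanishes at both end-points, and the resulting change
in the action is examined.

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Discretized S = sum of dt * [ (m/2) qdot^2 - (k/2) q^2 ] over each step."""
    total = 0.0
    for a, b in zip(path[:-1], path[1:]):
        qdot = (b - a) / dt          # finite-difference velocity on the step
        qmid = 0.5 * (a + b)         # midpoint position on the step
        total += dt * (0.5 * m * qdot ** 2 - 0.5 * k * qmid ** 2)
    return total

def classical_path(n, t_end):
    """q(t) = sin(t), which solves m q'' = -k q for m = k = 1 (assumed example)."""
    dt = t_end / n
    return [math.sin(i * dt) for i in range(n + 1)], dt

def perturbed(path, eps):
    """Add eps * sin(pi * t / T); the bump vanishes at both end-points."""
    n = len(path) - 1
    return [q + eps * math.sin(math.pi * i / n) for i, q in enumerate(path)]

def action_shift(eps, n=2000, t_end=1.0):
    """S[q + delta q] - S[q] for the end-point-preserving perturbation."""
    path, dt = classical_path(n, t_end)
    return action(perturbed(path, eps), dt) - action(path, dt)
```

If the classical path makes the action stationary, the shift should contain no
term linear in `eps`: halving `eps` should reduce the shift by a factor of about
four, and reversing the sign of `eps` should leave it essentially unchanged.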
It is an attractive human idea (Occam’s razor) that physical systems do the
‘least action’ possible; however, eqn. (4.9) is clearly no ordinary differentiation.
First of all, S is a scalar number: it is integrated over a dummy variable $t$,
so $t$ is certainly not a variable on which S depends. To distinguish this from
ordinary differentiation of a function with respect to a variable, it is referred to as
functional differentiation because it is differentiation with respect to a function.

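The functional variation of S with respect to $q(t)$ is defined by


$$ \delta S = S[q + \delta q] - S[q], \tag{4.10} $$


where $\delta q(t)$ is an infinitesimal change in the form of the function $q$ at time $t$.
Specifically, for the single-particle example,


$$ \delta S = \int dt \left\{ \frac{1}{2} m (\dot{q} + \delta\dot{q})^2 - V(q + \delta q) \right\} - \int dt \left\{ \frac{1}{2} m \dot{q}^2 - V(q) \right\}. \tag{4.11} $$


Now, since $\delta q$ is infinitesimal, we keep only the first-order contributions, so on
expanding the potential to first order as a Taylor series about $q(t)$,


$$ V(q + \delta q) = V(q) + \frac{dV}{dq}\,\delta q + \ldots, \tag{4.12} $$


one obtains the first-order variation of S,


$$ \delta S = \int dt \left\{ m\dot{q}\,\partial_t(\delta q) - \frac{dV}{dq}\,\delta q \right\}. \tag{4.13} $$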


A ‘dot’ has been exchanged for an explicit time derivative to emphasize the
time derivative of $\delta q$. Looking at this expression, one notices that, if the time
derivative did not act on $\delta q$, we would be able to take out an overall factor of
$\delta q$, and we would be almost ready to move $\delta q$ to the left hand side to make
something like a derivative. Since we are now operating under the integral sign,
it is possible to integrate by parts, using the property:


$$ \int_{t_1}^{t_2} dt\, A\,(\partial_t B) = \Big[ A B \Big]_{t_1}^{t_2} - \int_{t_1}^{t_2} dt\, (\partial_t A)\, B, \tag{4.14} $$


so that the time derivative can be removed from $\delta q$, giving:


$$ \delta S = \int dt \left\{ -m\ddot{q}(t) - \frac{dV}{dq(t)} \right\} \delta q(t) + \Big[ m\dot{q}\,\delta q(t) \Big]_{t_1}^{t_2}. \tag{4.15} $$


The stationary action criterion tells us that $\delta S = 0$. Assuming that $\delta q(t)$ is not
always zero, one obtains a restriction on the allowed values of $q(t)$. This result
must now be interpreted.
Fig. 4.1. The variational formula selects the path from A to B with a stationary value
of the action. Stationary or minimum means that the solution is stable on the surface of 
all field solutions. Unless one adds additional perturbations in the action, it will describe 
the ‘steady state’ behaviour of the system. 


4.1.2 δS: equation of motion


The first thing to notice about eqn. (4.15) is that it is composed of two logically
separate parts. The first term is an integral over all times which interpolate
between $t_1$ and $t_2$, and the second is a term which lives only at the end-points.
Now, suppose we ask the question: what path $q(t)$ is picked out by the action
principle, if we consider all the possible variations of paths $q(t) + \delta q(t)$, given
that the two end-points are always fixed, i.e. $\delta q(t_1) = 0$ and $\delta q(t_2) = 0$?

The requirement of fixed end-points now makes the second term in eqn. (4.15)
vanish, so that $\delta S = 0$ implies that the contents of the remaining curly braces
must vanish. This gives precisely the equation of motion


$$ m\ddot{q} = -\frac{dV}{dq}. \tag{4.16} $$


The action principle delivers the required formula as promised. This arises from 
an equation of constraint on the path q (t) — a constraint which forces the path to 
take a value satisfying the equation of motion. This notion of constraint recurs 
later, in more advanced uses of the action principle. 


4.1.3 The Euler-Lagrange equations 


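The Euler-Lagrange equations of motion are trivially derived from the action
principle for an arbitrary Lagrangian which is a function of $q$ and $\dot{q}$. The action
one requires is simply


$$ S = \int dt\, L(q(t), \dot{q}(t)), \tag{4.17} $$


and its variation can be written, using the functional chain-rule,


$$ \delta S = \int dt \left\{ \frac{\delta L}{\delta q}\,\delta q + \frac{\delta L}{\delta(\partial_t q)}\,\delta(\partial_t q) \right\}. \tag{4.18} $$


The variation of the path commutes with the time derivative (trivially), since


$$ \delta(\partial_t q) = \partial_t(q + \delta q) - \partial_t q = \partial_t(\delta q). \tag{4.19} $$


Thus, one may re-write eqn. (4.18) as


$$ \delta S = \int dt \left\{ \frac{\delta L}{\delta q}\,\delta q + \frac{\delta L}{\delta(\partial_t q)}\,\partial_t(\delta q) \right\}. \tag{4.20} $$


Integrating the second term by parts, one obtains


$$ \delta S = \int dt \left\{ \frac{\delta L}{\delta q} - \partial_t\left(\frac{\delta L}{\delta(\partial_t q)}\right) \right\} \delta q + \left[ \frac{\delta L}{\delta(\partial_t q)}\,\delta q \right]_{t_1}^{t_2} = 0. \tag{4.21} $$


The second term vanishes independently (since its variation is zero at the fixed
end-points), and thus one obtains the Euler-Lagrange equations (4.6).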


4.1.4 δS: continuity


Before leaving this simple world of classical particles, there is one more thing to
remark about eqn. (4.21). Consider the second term. When one asks the question
‘what is the condition on $q(t)$ for the classical trajectories with stationary action
and fixed end-points?’, this term drops out. It vanishes by assumption. It
contains useful information however. If we consider the example of a single
particle, the surface term has the form


$$ m\dot{q}\,\delta q = p\,\delta q. \tag{4.22} $$


This term represents the momentum of the particle. For a general Lagrangian,
one can use this fact to define a ‘generalized momentum’. From eqn. (4.21)


$$ p = \frac{\delta L}{\delta \dot{q}}. \tag{4.23} $$


Traditionally, this quantity is called the canonical momentum, or conjugate
momentum, and is denoted generically as $\Pi$.
Fig. 4.2. The continuity of paths obeying the equations of motion, over an infinitesimal
interval between A and B, is assured by the null variation of the action over that interval.


Suppose one asks a different question of the variation. Consider only an
infinitesimal time period $t_2 - t_1 = \epsilon$, where $\epsilon \to 0$. What happens between
the two limits of integration in eqn. (4.21) is now less important. In fact, it
becomes decreasingly important as $\epsilon \to 0$, since


$$ \delta S_{12} = \Big[ p\,\delta q \Big]_{t_1}^{t_2} + O(\epsilon). \tag{4.24} $$


What infinitesimal property of the action ensures that $\delta S = 0$ for all intermediate
points between the limits $t_1$ and $t_2$? To find out, we relax the condition that the
variation should vanish at the end-points. Then, over any infinitesimal interval $\epsilon$,
the change in $\delta q(t)$ can itself only be infinitesimal, unless $q(t)$ is singular, but
it need not vanish. However, as $\epsilon \to 0$, the change in this quantity must also
vanish as long as $q(t)$ is a smooth field, so one must take $\Delta(\delta q) = 0$.¹ This
means that


$$ \Delta p \equiv p(t_2) - p(t_1) = 0; \tag{4.25} $$


i.e. the change in momentum across any infinitesimal surface is zero, or
momentum is conserved at any point. This is a continuity condition on $q(t)$.
To see this, ask what would happen if the potential $V(q)$ contained a singular
term at the surface:


$$ V(q, t) = \delta(t - \bar{t})\,\Delta V + V(q), \tag{4.26} $$

1 Note that we are assuming that the field is a continuous function, but the momentum need
not be strictly continuous if there are impulsive forces (influences) on the field. This is fully
consistent with our new philosophy of treating the ‘field’ $q$ as a fundamental variable, and $p$
as a derived quantity.
where $\bar{t} = \frac{1}{2}(t_1 + t_2)$ is the mid-point of the infinitesimal interval. Here, the delta
function integrates out immediately, leaving an explicit surface contribution
from the potential, in addition to the term from the integration by parts:


$$ \delta S_{12} = \frac{d\Delta V}{dq}\,\delta q + \Big[ p\,\delta q \Big]_{t_1}^{t_2} + O(\epsilon) = 0. \tag{4.27} $$


Provided $\Delta V$ is finite, using the same argument as before, one obtains


$$ \Delta p = -\frac{d\Delta V}{dq}, \tag{4.28} $$


i.e. the change in momentum across any surface is a direct consequence of the
impulsive force $d\Delta V/dq$ at that surface.

We thus have another facet of the action: it evaluates relationships between 
dynamical variables which satisfy the constraints of stable behaviour. This 
property of the action is very useful: it generates standard continuity and 
boundary conditions in field theory, and is the backbone of the canonical 
formulation of both classical and quantum mechanics. For instance, in the 
case of the electromagnetic field, we can generate all of the ‘electromagnetic 
boundary conditions’ at interfaces using this technique (see section 21.2.2). This 
issue occurs more generally in connection with the energy-momentum tensor, 
in chapter 11, where we shall re-visit and formalize this argument. 
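
The momentum jump caused by an impulsive potential can be checked numerically.
The sketch below makes several assumptions that are not in the text: the delta
function is replaced by a narrow normalized Gaussian, the singular part of the
potential is taken to be linear, $\Delta V(q) = a q$, so that $d\Delta V/dq = a$, and the
regular part of the potential is set to zero. All names are hypothetical.

```python
import math

def kick_profile(t, t_bar, width):
    """Narrow normalized Gaussian standing in for delta(t - t_bar)."""
    z = (t - t_bar) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))

def momentum_jump(a=0.3, t_bar=0.5, width=1e-3, steps=20000, p0=1.0, m=1.0):
    """Integrate dp/dt = -dV/dq for V(q, t) = kick_profile(t) * a * q on 0 <= t <= 1.

    Returns p(1) - p(0). The regular potential is assumed to vanish, so the
    particle is free except during the kick.
    """
    dt = 1.0 / steps
    q, p = 0.0, p0
    for i in range(steps):
        t = (i + 0.5) * dt                             # midpoint of the step
        p += -a * kick_profile(t, t_bar, width) * dt   # impulsive force
        q += (p / m) * dt                              # position stays continuous
    return p - p0
```

Under these assumptions the jump returned by `momentum_jump()` should approach
$-a$, independently of the width of the kick, while the position changes
smoothly across the kick.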


4.1.5 Relativistic point particles 


The relativistically invariant form of the action for a single point particle is


$$ S = \int dt\, \sqrt{-g_{00}} \left\{ -\frac{1}{2} m\, \frac{dx^i(t)}{dt}\, g_{ij}\, \frac{dx^j(t)}{dt} + V \right\}. \tag{4.29} $$


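The particle positions trace out world-lines $q(t) = x(t)$. If we re-express this
in terms of the proper time $\tau$ of the particle, where


$$ \tau = t\,\gamma^{-1}, \qquad \gamma = \frac{1}{\sqrt{1 - \beta^2}}, \qquad \beta^2 = \frac{v^2}{c^2} = \frac{1}{c^2}\left(\frac{d\mathbf{x}}{dt}\right)^2, \tag{4.30} $$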


then the action may now be written in the frame of the particle, 


dt > ydt 


VE > YVE, (4.31) 
giving
1 /d ? 
S= [ove {-3" (=) + v=] . (4.32) 


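The field equations are therefore


$$ \frac{\delta S}{\delta x} = m\,\frac{d^2 x}{d\tau^2} + \frac{\partial V}{\partial x} = 0, \tag{4.33} $$


i.e.


$$ F = ma, \tag{4.34} $$


where


$$ F = -\nabla V, \qquad a = \frac{d^2 x}{d\tau^2}. \tag{4.35} $$


The conjugate momentum from the continuity condition is


$$ p = m\,\frac{dx}{d\tau}, \tag{4.36} $$


which is simply the relativistic momentum vector p. See section 11.3.1 for the
energy of the classical particle system.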
In the above derivation, we have treated the metric tensor as a constant, but in
curved spacetime $g_{\mu\nu}$ depends on the coordinates. In that case, the variation of
the action leads to the field equation


$$ \frac{d}{d\tau}\left( g_{\mu\nu}\,\frac{dx^\nu}{d\tau} \right) - \frac{1}{2}\,(\partial_\mu g_{\rho\nu})\,\frac{dx^\rho}{d\tau}\frac{dx^\nu}{d\tau} = 0. \tag{4.37} $$


The equation of a free particle on a curved spacetime is called the geodesic
equation. After some manipulation, it may be written


$$ \frac{d^2 x^\mu}{d\tau^2} + \Gamma^\mu_{\nu\rho}\,\frac{dx^\nu}{d\tau}\frac{dx^\rho}{d\tau} = 0. \tag{4.38} $$


Interestingly, this equation can be obtained from the absurdly simple variational
principle:


$$ \delta \int ds = 0, \tag{4.39} $$


where ds is the line element, described in section 3.2.5. See also section 25.4.
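
The geodesic equation is easy to explore numerically. The sketch below is an
illustration under assumptions not made in the text: instead of a spacetime, it
uses the unit 2-sphere with line element $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$, whose only
non-zero connection coefficients are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and
$\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$. The equation is integrated with a standard
fourth-order Runge-Kutta step, and all function names are hypothetical.

```python
import math

def geodesic_rhs(state):
    """First-order form of the geodesic equation on the unit 2-sphere."""
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2          # -Gamma^theta_phiphi * dphi^2
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi  # -2 Gamma^phi_thetaphi * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, h=1e-3, steps=5000):
    """Advance the state (theta, phi, dtheta, dphi) by steps * h in the path parameter."""
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

def speed_squared(state):
    """g_ij (dx^i/dtau)(dx^j/dtau), constant along an affinely parametrized geodesic."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2

def angular_momentum(state):
    """sin^2(theta) * dphi, conserved because the metric does not depend on phi."""
    theta, _, _, dphi = state
    return math.sin(theta) ** 2 * dphi
```

For an initial state such as `(1.0, 0.0, 0.1, 0.9)`, which stays away from the
coordinate singularities at the poles, both `speed_squared` and
`angular_momentum` should be preserved by the integration to within the accuracy
of the integrator.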
4.2 Frictional forces and dissipation


In many branches of physics, phenomenological equations are used for the 
dissipation of energy. Friction and ohmic resistance are two common examples. 
Empirical frictional forces cannot be represented by a microscopic action 
principle, since they arise physically only through time-dependent boundary 
conditions on the system. No fundamental dynamical system is dissipative at 
the microscopic level; however, fluctuations in dynamical variables, averaged 
over time, can lead to a re-distribution of energy within a system, and this is 
what leads to dissipation of energy from one part of a system to another. More 
advanced statistical notions are required to discuss dissipation fully, but a few 
simple observations can be made at the level of the action. 
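Consider the example of the frictional force represented by Langevin’s
equation:


$$ m\,\frac{d^2 x}{dt^2} + \alpha\,\dot{x} = F(t). \tag{4.40} $$


Initially it appears as though one could write the action in the following way:


$$ S = \int dt \left\{ \frac{1}{2} m \left(\frac{dx}{dt}\right)^2 + \frac{1}{2}\,\alpha\, x\,\frac{dx}{dt} \right\}. \tag{4.41} $$


However, if one varies this action with respect to $x$, the term proportional to $\alpha$
gives


$$ \int dt\, \frac{\alpha}{2} \left( \delta x\,\frac{d}{dt} x + x\,\frac{d}{dt}\delta x \right). \tag{4.42} $$


But this term is a total derivative. Integrating by parts yields


$$ \int_a^b dt\, \frac{\alpha}{2}\,\frac{d}{dt}\big( x\,\delta x \big) = \left[ \frac{\alpha}{2}\, x\,\delta x \right]_a^b = 0, \tag{4.43} $$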


which may be ignored, since it exists only on the boundary. Because of the 
reversibility of the action principle, one cannot introduce terms which pick out a 
special direction in time. The only place where such terms can appear is through 
boundary conditions. For the same reason, it is impossible to represent Ohm’s 
law 


$$ J^i = \sigma E^i \tag{4.44} $$


in an action principle. An ohmic resistor has to dissipate heat as current passes 
through it. 
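
That a velocity-dependent term of this kind contributes only at the boundary can
be seen numerically. The sketch below assumes the candidate friction term has
the form $\frac{1}{2}\alpha\, x\, \dot{x}$, so that it is the total derivative of $\frac{1}{4}\alpha x^2$, and
evaluates its time integral along two different paths sharing the same
end-points. The paths and the function names are hypothetical choices.

```python
import math

def friction_term(path, alpha=0.7):
    """Discretized integral of (alpha/2) * x * dx/dt over the path.

    On each step, (dx/dt) * dt = b - a, so the time step cancels and only the
    sequence of positions matters.
    """
    total = 0.0
    for a, b in zip(path[:-1], path[1:]):
        x_mid = 0.5 * (a + b)                      # midpoint value of x
        total += 0.5 * alpha * x_mid * (b - a)
    return total

def straight_path(x_a, x_b, n=1000):
    """Uniform motion from x_a to x_b."""
    return [x_a + (x_b - x_a) * i / n for i in range(n + 1)]

def wiggly_path(x_a, x_b, n=1000):
    """Same end-points, with an oscillation superposed in between."""
    base = straight_path(x_a, x_b, n)
    return [x + 0.5 * math.sin(3.0 * math.pi * i / n) for i, x in enumerate(base)]
```

Both paths should give the same value, $\frac{1}{4}\alpha\,(x_b^2 - x_a^2)$, which depends only on
the end-points. A term with this property cannot influence the equations of
motion in the interior.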

In some cases, the action principle can be tricked into giving a non-zero
contribution from velocity-dependent terms by multiplying the whole Lagrangian
with an ‘integrating factor’ $\exp(\gamma(t))$, but the resulting field equations require
$\gamma(t)$ to make the whole action decay exponentially, and often the results are
ambiguous and not physically motivated.

We shall return to the issue of dissipation in detail in chapter 6 and show the 
beginnings of how physical boundary conditions and statistical averages can be 
incorporated into the action principle, in a consistent manner, employing the 
principle of causality. It is instructive to show that it is not possible to write
down a gauge-invariant action for the equation


$$ J^i = \sigma E^i, \tag{4.45} $$


i.e. Ohm’s law, in terms of the vector potential $A_\mu$. The equation is only an
effective representation of an averaged statistical effect, because it does not provide
a reversible description of the underlying physics.


(1) By varying with respect to $A_\mu$, one may show that the action

$$ S = \int (dx) \left\{ J^i A_i - \sigma_{ij}\, A^i E^j \right\} \tag{4.46} $$

with $E_i = -\partial_t A_i - \partial_i A_0$, does not give eqn. (4.45). If one postulates
that $E^i$ and $J^i$ may be replaced by their steady state (time-independent)
averages $\langle E^i \rangle$ and $\langle J^i \rangle$, then we can show that this does give the correct
equation. This is an indication that some averaging procedure might be
the key to representing dissipative properties of bulk matter.


(2) Consider the action 


wa 


S= J (dx) {J"A, — oj A Eie Y" e), (4.47) 


This may be varied with respect to $A_0$ and $A_i$ to find the equations of
motion; gauge invariance requires the equations to be independent of the
vector potential $A_\mu$. On taking $\sigma_{ij} = \sigma\delta_{ij}$, one can show that gauge
invariance requires that the vector potential decay exponentially. Readers
are encouraged to check whether the resulting equations of motion are a
satisfactory representation of Ohm’s law.


4.3 Functional differentiation 


It is useful to define the concept of functional differentiation, which is to
ordinary differentiation what $\delta q(t)$ is to $dq$. Functional differentiation differs
from normal differentiation in some important ways.

The ordinary derivative of a function with respect to its control variable is
defined by


$$ \frac{df(t)}{dt} = \lim_{\delta t \to 0} \frac{f(t + \delta t) - f(t)}{\delta t}. \tag{4.48} $$
It tells us about how a function changes with respect to the value of its control
variable at a given point. Functional differentiation, on the other hand, is
something one does to an integral expression; it is performed with respect to
a function of some variable of integration. The ‘point of differentiation’ is now
a function $f(t)$ evaluated at a special value of its control variable $t'$. It takes
some value from within the limits of the integral. So, whereas we start with a
quantity which is not a function of $t$ or $t'$, the result of the functional differentiation
is a function which is evaluated at the point of differentiation. Consider, as an
example, the arbitrary functional


$$ F[f] = \int dt\, \sum_n a_n\, (f(t))^n. \tag{4.49} $$


This is clearly not a function of $t$ due to the integral. The variation of such a
functional $F[f]$ is given by


$$ \delta F[f] = F[f(t) + \delta f(t)] - F[f(t)]. \tag{4.50} $$


We define the functional derivative by


$$ \frac{\delta F}{\delta f(t')} = \lim_{\epsilon \to 0} \frac{F[f(t) + \epsilon\,\delta(t - t')] - F[f(t)]}{\epsilon}. \tag{4.51} $$


This is a function, because an extra variable $t'$ has been introduced. You can
check that this has the unusual side effect that


$$ \frac{\delta q(t)}{\delta q(t')} = \delta(t - t'), \tag{4.52} $$


which is logical (since we expect the derivative to differ from zero only if the
function is evaluated at the same point), but unusual, since the right hand side is
not dimensionless, in spite of the fact that the left hand side seems to be. On
the other hand, if we define a functional


$$ Q = \int dt\, q(t), \tag{4.53} $$


then we have


$$ \frac{\delta Q}{\delta q(t')} = \int dt\, \frac{\delta q(t)}{\delta q(t')} = \int dt\, \delta(t - t') = 1. \tag{4.54} $$


Thus, the integral plays a key part in the definition of differentiation for
functionals.
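
The definition of the functional derivative can be imitated on a grid, which
makes the role of the delta function and of the integral concrete. The sketch
below is only an illustration: the grid, the sample function $f(t) = 1 + t$ and
all names are assumptions. On a grid of spacing `dt`, the delta function
$\delta(t - t_j)$ is represented by a spike of height `1/dt` at node `j`.

```python
def power_functional(f_vals, dt, n=3):
    """F[f] = integral of f(t)^n dt, approximated by a Riemann sum."""
    return sum(v ** n for v in f_vals) * dt

def linear_functional(q_vals, dt):
    """Q[q] = integral of q(t) dt, approximated by a Riemann sum."""
    return sum(q_vals) * dt

def functional_derivative(functional, f_vals, dt, j, eps=1e-6):
    """(F[f + eps * delta(t - t_j)] - F[f]) / eps for a small but finite eps."""
    bumped = list(f_vals)
    bumped[j] += eps / dt          # grid version of eps * delta(t - t_j)
    return (functional(bumped, dt) - functional(f_vals, dt)) / eps

def sample_function(n_points=100, t_end=1.0):
    """Values of f(t) = 1 + t on a uniform grid, together with the grid spacing."""
    dt = t_end / n_points
    return [1.0 + i * dt for i in range(n_points)], dt
```

For the cubic functional the result at node `j` should be close to
$3 f(t_j)^2$, a function of the point of differentiation, while for the linear
functional it should be 1 at every node.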


4.4 The action in covariant field theory 


The action principle can be extended to generally covariant field theories. This
generalization is trivial in practice. An important difference is that field theories
are defined in terms of variables which depend not only on time but also on
space; $\phi(\mathbf{x}, t) = \phi(x)$. This means that the action, which must be a scalar,
without functional dependence, must also be integrated over space in addition
to time. Since the final action should have the dimensions of energy × time, this
means that the Lagrangian is to be replaced by a Lagrangian density $\mathcal{L}$:


$$ S = \int (dx)\, \mathcal{L}(\phi(\mathbf{x}, t), \partial_\mu \phi(\mathbf{x}, t), x). \tag{4.55} $$


The integral measure is $(dx) = dV_x/c$, where $dV_x = c\,dt\,d^n x\,\sqrt{g} = dx^0\, d^n x\,\sqrt{g}$.
Although it would be nice to use $dV_x$ here (since this is the Minkowski space
volume element), this is not possible if $\mathcal{L}$ is an energy density and S is to have the
dimensions of action.² The non-relativistic action principle has already chosen
this convention for us. The special role played by time is also manifest in
that the volume is taken between an earlier time $t$ and a later time $t'$ or, more
correctly, from one spacelike hyper-surface, $\sigma$, to another, $\sigma'$.


The classical interpretation of the action as the integral over T — V, the kinetic 
energy minus the potential energy, does not apply in the general case. The 
Lagrangian density has no direct physical interpretation; it is merely an artefact
which gives the correct equations of motion. What is important, however, is 
how one defines a Hamiltonian, or energy functional, from the action. The 
Hamiltonian is related to measurable quantities, namely the total energy of the 
system at a given time, and it is responsible for the time development of the 
system. One must be careful to use consistent definitions, e.g. by sticking to the 
notation and conventions used in this book. 


Another important difference between field theory and particle mechanics is 
the role of position. Particle mechanics describes the trajectories of particles, 
q(t), as a function of time. The position was a function with time as a parameter. 
In field theory, however, space and time are independent parameters, on a par 
with one another, and the ambient field is a function which depends on both 
of them. In particle mechanics, the action principle determines the equation 
for a constrained path q(t); the field theoretical action principle determines an 
equation for a field which simultaneously exists at all spacetime points, i.e. 
it does not single out any trajectory in spacetime, but rather a set of allowed 
solutions for an omnipresent field space. In spite of this difference, the formal 
properties of the action principle are identical, but for an extra integration: 


2 One could absorb a factor of c into the definition of the field $\phi(x)$, since its dimensions are
not defined, but this would then mean that the Lagrangian and Hamiltonian would not have 
the dimensions of energy. This blemish on the otherwise beautiful notation is eliminated when 
one chooses natural units in which c = 1. 
4.4.1 Field equations and continuity


For illustrative purposes, consider the following action:


$$ S = \int dV_x \left\{ \frac{1}{2} (\partial^\mu \phi)(\partial_\mu \phi) + \frac{1}{2} m^2 \phi^2 - J\phi \right\}, \tag{4.56} $$


where $dV_x = c\,dt\,dx$. Assuming that the variables $\phi(x)$ commute with one
another, the variation of this action is given by


$$ \delta S = \int dV_x \left\{ (\partial^\mu \delta\phi)(\partial_\mu \phi) + m^2 \phi\,\delta\phi - J\,\delta\phi \right\}. \tag{4.57} $$


Integrating this by parts and using the commutativity of the field, one has


$$ \delta S = \int dV_x\, \delta\phi \left\{ -\Box\phi + m^2 \phi - J \right\} + \int d\sigma^\mu\, \delta\phi\, (\partial_\mu \phi). \tag{4.58} $$


From the general arguments given earlier, one recognizes a piece which is purely
a surface integral and a piece which applies to the field in a general volume of
spacetime. These terms vanish separately. This immediately results in the field
equations of the system,


$$ (-\Box + m^2)\,\phi(x) = J(x), \tag{4.59} $$


and a continuity condition which we shall return to presently.
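
A simple numerical check of a field equation of this type is possible in the
static case. The sketch below rests on assumptions that go beyond the text: the
field is taken to be time-independent in one space dimension, the sign
convention is assumed to be such that $-\Box$ reduces to $-d^2/dx^2$ for static fields,
and the source is a unit point source, $J(x) = \delta(x)$. The equation
$(-d^2/dx^2 + m^2)\phi = J$ is discretized with second differences and solved with
the Thomas algorithm for tridiagonal systems. All names are hypothetical.

```python
import math

def solve_static_field(m=1.0, half_width=10.0, n=2001):
    """Solve (-d^2/dx^2 + m^2) phi = delta(x) on [-half_width, half_width].

    The field is taken to vanish just outside the grid. Returns (xs, phi).
    """
    h = 2.0 * half_width / (n - 1)
    xs = [-half_width + i * h for i in range(n)]
    diag = 2.0 / h ** 2 + m ** 2      # coefficient of phi_i
    off = -1.0 / h ** 2               # coefficient of phi_{i-1} and phi_{i+1}
    rhs = [0.0] * n
    rhs[n // 2] = 1.0 / h             # grid stand-in for the delta function at x = 0

    # Thomas algorithm: forward elimination followed by back substitution.
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = off / diag
    dp[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * cp[i - 1]
        cp[i] = off / denom
        dp[i] = (rhs[i] - off * dp[i - 1]) / denom
    phi = [0.0] * n
    phi[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = dp[i] - cp[i] * phi[i + 1]
    return xs, phi

def exact_static_field(x, m=1.0):
    """Continuum solution exp(-m |x|) / (2 m) for a unit point source at x = 0."""
    return math.exp(-m * abs(x)) / (2.0 * m)
```

The numerical solution should follow the exponentially decaying continuum
profile closely wherever the boundaries are many decay lengths away.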

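The procedure can be reproduced for a general Lagrangian density $\mathcal{L}$ and
gives the Euler-Lagrange equations for a field. Taking the general form of the
action in eqn. (4.55), one may write the first variation


$$ \delta S = \int (dx) \left\{ \frac{\partial \mathcal{L}}{\partial \phi}\,\delta\phi + \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)}\,\delta(\partial_\mu \phi) \right\}. \tag{4.60} $$


Now, the variation symbol and the derivative commute with one another since
both act linearly on the field:


$$ \delta(\partial_\mu \phi) = \partial_\mu(\phi + \delta\phi) - \partial_\mu \phi = \partial_\mu(\delta\phi); \tag{4.61} $$


thus, one may integrate by parts to obtain


$$ \delta S = \int (dx) \left\{ \frac{\partial \mathcal{L}}{\partial \phi} - \partial_\mu\left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} \right) \right\} \delta\phi + \frac{1}{c} \int d\sigma^\mu\, \delta\phi \left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} \right). \tag{4.62} $$


The first of these terms exists for every spacetime point in the volume of
integration, whereas the second is restricted only to the bounding hyper-surfaces
$\sigma$ and $\sigma'$. These two terms must therefore vanish independently in general.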
The vanishing integrand of the first term gives the Euler-Lagrange equations of
motion for the field


$$ \frac{\partial \mathcal{L}}{\partial \phi} - \partial_\mu\left( \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} \right) = 0, \tag{4.63} $$


and the vanishing of the second term leads to the boundary continuity condition,


$$ \Delta\left( \frac{\partial \mathcal{L}}{\partial(\partial_\sigma \phi)} \right) \delta\phi = 0. \tag{4.64} $$


If this result is compared with eqns. (4.22) and (4.23), an analogous ‘momentum’,
or conjugate variable to the field $\phi(x)$, can be defined. This conjugate
variable is usually denoted $\Pi(x)$:


$$ \Pi(x) = \frac{\delta \mathcal{L}}{\delta(\partial^0 \phi)}, \tag{4.65} $$


and is derived by taking the canonical spacelike hyper-surface with $\sigma = 0$. Note
the position of indices such that the variable transforms like a covariant vector
$p = \partial_0 q$. The covariant generalization of this is


$$ \Pi_\sigma(x) = \frac{\delta \mathcal{L}}{\delta(\partial^\sigma \phi)}. \tag{4.66} $$


4.4.2 Uniqueness of the action 


In deriving everything from the action principle, one could gain the impression 
that there is a unique prescription at work. This is not the case. The definition 
of the action itself is not unique. There is always an infinity of actions 
which generates the correct equations of motion. This infinity is obtained by 
multiplying the action by an arbitrary complex number. In addition to this trivial 
change, there may be several actions which give equivalent results depending on 
(i) what we take the object of variation to be, and (ii) what we wish to deduce 
from the action principle. For example, we might choose to re-parametrize the 
action using new variables. The object of variation and its conjugate are then 
re-defined. 

It is clear from eqn. (4.21) that the field equations and boundary conditions 
would be the same if one were to re-define the Lagrangian by multiplying by a 
general complex number: 


S — (a+ib)S. (4.67) 


The complex factor would simply cancel out of the field equations and boundary 
conditions. Moreover, the Lagrangian itself has no physical meaning, so there 
is no physical impediment to such a re-definition. In spite of this, it is normal
to choose the action to be real. The main reason for this is that this choice 
makes for a clean relationship between the Lagrangian and a new object, the 
Hamiltonian, which is related to the energy of the system and is therefore, by 
assumption, a real quantity. 

Except in the case of the gravitational field, one is also free to add a term on 
to the action which is independent of the field variables, since this is always zero 
with respect to variations in the fields: 


$$ S \to S + \int (dx)\, \Lambda. \tag{4.68} $$


Such a term is often called a cosmological constant, because it was introduced 
by Einstein into the theory of relativity in order to create a static (non-expansive) 
cosmology. Variations of the action with respect to the metric are not invariant 
under the addition of this term, so the energy-momentum tensor in chapter 11 
is not invariant under this change, in general. Since the Lagrangian density is an 
energy density (up to a factor of c), the addition of this arbitrary term in a flat 
(gravitation-free) spacetime simply reflects the freedom one has in choosing an 
origin for the scale of energy density for the field.³

Another way in which the action can be re-defined is by the addition of a total 
derivative, 


$$ S \to S + \int (dx)\, \partial^\mu F_\mu = S + \frac{1}{c} \int d\sigma^\mu\, F_\mu. \tag{4.69} $$


The additional term exists only on the boundaries $\sigma$ of the volume integral.
By assumption, the surface term vanishes independently of the rest, thus, since 
the field equations are defined entirely from the non-surface contributions, they 
will never be affected by the addition of such a total derivative. However, 
the boundary conditions or continuity will depend on this addition. This has 
a physical interpretation: if the boundary of a physical system involves a 
discontinuous change, it implies the action of an external agent at the boundary. 
Such a jump is called a contact potential. It might signify the connection of a 
system to an external potential source (a battery attached by leads, for instance). 
The connection of a battery to a physical system clearly does not change the laws 
of physics (equations of motion) in the system, but it does change the boundary 
conditions. 

In light of this observation, we must be cautious to write down a ‘neutral’, 
or unbiased action for free systems. This places a requirement on the action, 


3 Indeed, the action principle 6S = 0 can be interpreted as saying that only potential differences 
are physical. The action potential itself has no unique physical interpretation. 
namely that the action must be Hermitian, time-reversal-invariant, or symmetrical
with respect to the placement of derivatives, so that, if we let $t \to -t$, then
nothing is changed. For instance, one writes


$$ (\partial^\mu \phi)(\partial_\mu \phi) \quad \text{instead of} \quad \phi(-\Box\phi), \tag{4.70} $$


for quadratic derivatives, and


$$ \frac{1}{2}\left( \phi^* \overset{\leftrightarrow}{\partial_t} \phi \right) = \frac{1}{2}\left( \phi^*(\partial_t \phi) - (\partial_t \phi^*)\phi \right) \quad \text{instead of} \quad \phi^* \partial_t \phi, \tag{4.71} $$


in the case of linear derivatives. These alternatives differ only by an integration 
by parts, but the symmetry is essential for the correct interpretation of the action 
principle as presented. This point recurs in more detail in section 10.3.1. 
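
That the alternatives in eqns. (4.70) and (4.71) differ only by total
derivatives follows from the product rule. The short check below is not part of
the original text, and assumes $\Box = \partial^\mu \partial_\mu$:

```latex
\begin{align*}
(\partial^\mu \phi)(\partial_\mu \phi)
  &= \partial^\mu\left( \phi\, \partial_\mu \phi \right) + \phi\,(-\Box \phi), \\
\tfrac{1}{2}\left( \phi^* (\partial_t \phi) - (\partial_t \phi^*)\, \phi \right)
  &= \phi^* \partial_t \phi - \tfrac{1}{2}\, \partial_t\left( \phi^* \phi \right).
\end{align*}
```

In each line the symmetrical form on the left equals the unsymmetrical form plus
a total derivative, which contributes only on the bounding surfaces.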


4.4.3 Limitations of the action principle 


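In 1887, Helmholtz showed that an equation of motion can only be derived from
Lagrange’s equations of motion (4.6) if the generalized force can be written


$$ F_i = -\frac{\partial V}{\partial q_i} + \frac{d}{dt}\frac{\partial V}{\partial \dot{q}_i}, \tag{4.72} $$


where $V = V(q, \dot{q}, t)$ is the potential in $L = T - V$, and the following identities
are satisfied:


$$ \begin{aligned}
\frac{\partial F_i}{\partial \ddot{q}_j} &= \frac{\partial F_j}{\partial \ddot{q}_i}, \\
\frac{\partial F_i}{\partial \dot{q}_j} + \frac{\partial F_j}{\partial \dot{q}_i} &= \frac{d}{dt}\left( \frac{\partial F_i}{\partial \ddot{q}_j} + \frac{\partial F_j}{\partial \ddot{q}_i} \right), \\
\frac{\partial F_i}{\partial q_j} - \frac{\partial F_j}{\partial q_i} &= \frac{1}{2}\frac{d}{dt}\left( \frac{\partial F_i}{\partial \dot{q}_j} - \frac{\partial F_j}{\partial \dot{q}_i} \right).
\end{aligned} \tag{4.73} $$


For a review and discussion of these conditions, see ref. [67]. These relations lie
at the core of Feynman’s ‘proof’ of Maxwell’s equations [42, 74]. Although they
are couched in a form which derives from the historical approach of varying the
action with respect to the coordinate $q_i$ and its associated velocity, $\dot{q}_i$, separately,
their covariant generalization effectively summarizes the limits of generalized
force which can be derived from a local action principle, even using the approach
taken here. Is this a significant limitation of the action principle?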
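
As a brief illustration of how such identities exclude dissipative forces,
consider a single degree of freedom subject to linear friction. This example is
an assumption introduced here, and it uses the Helmholtz conditions in their
standard form with $i = j$:

```latex
\begin{align*}
F &= -\alpha\, \dot{q}, \qquad
\frac{\partial F}{\partial \ddot{q}} = 0, \qquad
\frac{\partial F}{\partial \dot{q}} = -\alpha, \\
\frac{\partial F}{\partial \dot{q}} + \frac{\partial F}{\partial \dot{q}} &= -2\alpha
\quad \text{whereas} \quad
\frac{d}{dt}\left( \frac{\partial F}{\partial \ddot{q}} + \frac{\partial F}{\partial \ddot{q}} \right) = 0 .
\end{align*}
```

The second condition therefore holds only if $\alpha = 0$, so a frictional force of
this form cannot be obtained from Lagrange's equations in this way.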

Ohm’s law is an example where a Lagrangian formulation does not work 
convincingly. What characterizes Ohm’s law is that it is a substantive rela- 
tionship between large-scale averages, derived from a deeper theory, whose 
actual dynamics are hidden and approximated at several levels. The relation 
summarizes a coarse average result of limited validity. Ohm’s law cannot be 
derived from symmetry principles, only from a theory with complex hidden
variables. The deeper theory from which it derives (classical electrodynamics 
and linear response theory) does have an action principle formulation however. 

Ohm’s law is an example of how irreversibility enters into physics. The 
equations of fundamental physics are reversible because they deal only with 
infinitesimal changes. An infinitesimal interval, by assumption, explores so 
little of its surrounding phase space that changes are trivially reversed. This 
is the main reason why a generating functional (action) formulation is so 
successful at generating equations of motion: it is simply a mechanism for 
exploring the differential structure of the action potential-surface in a local 
region; the action is a definition of a conservation book-keeping parameter 
(essentially energy), parametrized in terms of field variables. The reversible, 
differential structure ensures conservation and generates all of the familiar 
quantities such as momentum. Irreversibility arises only when infinitesimal 
changes are compounded into significant changes; i.e. when one is able to 
explore the larger part of the phase space and take account of long-term history 
of a system. The methods of statistical field theory (closed time path [116] 
and density matrices [49]) may be used to study long-term change, based on 
sums of differential changes. Only in this way can one relate differential law to 
macroscopic change. 

Another way of expressing the above is that the action principle provides a 
concise formulation of Markov processes, or processes whose behaviour now 
is independent of what happened in their past. Non-Markov processes, or 
processes whose behaviour now depends on what happened to them earlier, 
require additional long-term information, which can only be described by the 
combination of many infinitesimal changes. 

Clearly, it is possible to write down equations which cannot be easily derived 
from an action principle. The question is whether such equations are of 
interest to physics. Some of them are (such as Ohm’s law), but these only fail 
because employing an action principle formulation of a high-level emergent 
phenomenon ignores the actual energy accounting taking place in the system. 
If one jumps in at the level of an effective field theory, one is not guaranteed 
an effective energy parameter which obeys the reversible accounting rules of 
the action principle. If an action principle formulation fails to make sense, 
it is possible to go to a deeper, more microscopic theory and re-gain an 
action formulation, thereby gaining a more fundamental (though perhaps more 
involved) understanding of the problem. 

So are there any fundamental, elementary processes which cannot be derived 
from an action principle? The answer is probably not. Indeed, today all 
formulations of elementary physics assume an action principle formulation at 
the outset. What one can say in general is that any theory derived from an 
action principle, based on local fields, will lead to a well defined problem, 
within a natural, covariant formulation. This does not guarantee any prescription 
for understanding physical phenomena, but it does faithfully generate differential 
formulations which satisfy the symmetry principle. 


4.4.4 Higher derivatives 


Another possibility which is not considered in this book is that of higher 
derivative terms. The actions used here are at most quadratic in the derivatives. 
Particularly in speculative gravitational field theories, higher derivative terms do 
occur in the literature (often through terms quadratic in the curvature, such as 
Gauss–Bonnet terms or Weyl couplings); these are motivated by geometrical or 
topological considerations, and are therefore ‘natural’ to consider. Postulating 
higher order derivative terms is usually not useful in other contexts. 

Higher derivative terms are often problematic, for several reasons. The 
main reason is that they lead to acausal solutions and ‘ghost’ excitations, 
or to field modes which appear to be solutions, but which actually do not 
correspond to physical propagations. In the quantum field theory, they are 
non-renormalizable. Although none of these problems is itself sufficient to 
disregard higher derivatives entirely, it limits their physical significance and 
usefulness. Some higher derivative theories can be factorized and expressed 
as coupled local fields with no more than quadratic derivatives; thus, a difficult 
action may be re-written as a simpler action, in a different formulation. This 
occurs, for instance, if the theories arise from non-local self-energy terms. 


4.5 Dynamical and non-dynamical variations 


It is convenient to distinguish between two kinds of variations of tensor quanti- 
ties. These occur in the derivation of field equations and symmetry generators, 
such as energy and momentum, from the action. 


4.5.1 Scalar fields 


The first kind of variation is a dynamical variation; it has been used implicitly 
up to now. A dynamical variation of an object q is defined by 


$$ \delta q = q'(x) - q(x). \tag{4.74} $$


This represents a change in the function q(x) at constant position x. It is like 
the ‘rubber-banding’ of a function into a new function: a parabola into a cubic 
curve, and so on. 

The other kind of variation is a coordinate variation, or kinematical variation, 
which we denote 


$$ \delta_x q(x) = q(x') - q(x). \tag{4.75} $$

This is the apparent change in the height of the function when making a shift 
in the coordinates $x$, or perhaps some other parameter which appears either 
explicitly or implicitly in the action. More generally, the special symbol $\delta_\xi$ is 
used for a variation with respect to the parameter $\xi$. By changing the coordinates 
in successive variations, $\delta x$, one could explore the entire function $q(x)$ at 
different points. This variation is clearly related to the partial (directional) 
derivative of $q$. For instance, under a shift 

$$ x^\mu \to x^\mu + \epsilon^\mu, \tag{4.76} $$

i.e. $\delta x^\mu = \epsilon^\mu$, we have 

$$ \delta_x q(x) = (\partial_\mu q)\,\epsilon^\mu. \tag{4.77} $$

One writes the total variation in the field $q$ as 

$$ \delta_T = \delta + \sum_i \delta_{\xi_i}. \tag{4.78} $$
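
The distinction between the two kinds of variation can be checked numerically. The following is a minimal sketch in one dimension, assuming a hypothetical profile q(x) = x^2 and a hypothetical deformation into a cubic; none of these names come from the text. It compares the coordinate variation q(x + eps) - q(x) with its first-order form (dq/dx) eps, and evaluates a dynamical variation at fixed x.

```python
def q(x):
    """Sample field profile (a hypothetical choice): a parabola."""
    return x * x


def q_deformed(x, lam=1.0e-3):
    """The same profile 'rubber-banded' towards a cubic curve."""
    return x * x + lam * x ** 3


def dq_dx(x):
    """Exact derivative of the sample profile."""
    return 2.0 * x


x0 = 1.5      # point at which the variations are evaluated
eps = 1.0e-4  # small coordinate shift, x' = x + eps

# Coordinate (kinematical) variation: same function, shifted argument.
delta_x_q = q(x0 + eps) - q(x0)

# First-order form of the coordinate variation, (dq/dx) * eps.
first_order = dq_dx(x0) * eps

# Dynamical variation: new function, same argument.
delta_q = q_deformed(x0) - q(x0)

print("coordinate variation :", delta_x_q)
print("first-order estimate :", first_order)
print("dynamical variation  :", delta_q)
```

For the parabola the two coordinate-variation numbers differ by exactly eps squared, which is the second-order term dropped in the linearized formula.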


4.5.2 Gauge and vector fields 


The coordinate variation of a vector field is simply 


$$
\begin{aligned}
\delta_x V_\mu &= V_\mu(x') - V_\mu(x) \\
&= (\partial_\lambda V_\mu)\,\epsilon^\lambda.
\end{aligned} \tag{4.79}
$$


For a gauge field, the variation is more subtle. The field at position x’ need only 
be related to the Taylor expansion of the field at x up to a gauge transformation, 
so 


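$$
\begin{aligned}
\delta_x A_\mu &= A_\mu(x') - A_\mu(x) \\
&= (\partial_\lambda A_\mu)\,\epsilon^\lambda + (\partial_\lambda \partial_\mu s)\,\epsilon^\lambda.
\end{aligned} \tag{4.80}
$$


The gauge transformation $s$ is important because $\delta_x A_\mu(x)$ is a potential difference, 
and we know that potential differences are observable as the electric and 
magnetic fields, so this variation should be gauge-invariant. To make this so, 
one identifies the arbitrary gauge function $s$ by $\partial_\lambda s = -A_\lambda$, which is equally 
arbitrary, owing to the gauge symmetry. Then one has 


$$
\begin{aligned}
\delta_x A_\mu &= (\partial_\lambda A_\mu - \partial_\mu A_\lambda)\,\epsilon^\lambda \\
&= F_{\lambda\mu}\,\epsilon^\lambda.
\end{aligned} \tag{4.81}
$$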


Neglect of the gauge freedom has led to confusion over the definition of the 
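energy-momentum tensor for gauge fields; see section 11.5. 

The dynamical variation of a vector field follows from the general tensor 
transformation 

$$ V'_\mu(x') = \frac{\partial x^\nu}{\partial x'^\mu}\, V_\nu(x). \tag{4.82} $$

From this we have 

$$
\begin{aligned}
\delta V_\mu(x) &= V'_\mu(x) - V_\mu(x) \\
&= V'_\mu(x') - (\partial_\lambda V_\mu)\,\epsilon^\lambda - V_\mu(x) \\
&= \frac{\partial x^\nu}{\partial x'^\mu}\, V_\nu(x) - (\partial_\lambda V_\mu)\,\epsilon^\lambda - V_\mu(x) \\
&= -(\partial_\mu \epsilon_\nu)\, V^\nu - (\partial_\lambda V_\mu)\,\epsilon^\lambda.
\end{aligned} \tag{4.83}
$$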


For the gauge field, one should again be wary about the implicit coordinate 
variation. The analogous derivation gives 


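$$
\begin{aligned}
\delta A_\mu(x) &= A'_\mu(x) - A_\mu(x) \\
&= A'_\mu(x') - F_{\lambda\mu}\,\epsilon^\lambda - A_\mu(x) \\
&= \frac{\partial x^\nu}{\partial x'^\mu}\, A_\nu(x) - F_{\lambda\mu}\,\epsilon^\lambda - A_\mu(x) \\
&= -(\partial_\mu \epsilon_\nu)\, A^\nu - F_{\lambda\mu}\,\epsilon^\lambda.
\end{aligned} \tag{4.84}
$$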


4.5.3 The metric and second-rank tensors 


The coordinate variation of the metric is obtained by Taylor-expanding the 
metric about a point x, 


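$$
\begin{aligned}
\delta_x g_{\mu\nu} &= g_{\mu\nu}(x') - g_{\mu\nu}(x) \\
&= (\partial_\lambda g_{\mu\nu}(x))\,\epsilon^\lambda.
\end{aligned} \tag{4.85}
$$

To obtain the dynamical variation, we must use the tensor transformation rule 

$$ g'_{\mu\nu}(x') = \frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\, g_{\rho\sigma}(x), \tag{4.86} $$

where 

$$ \frac{\partial x^\rho}{\partial x'^\mu} = \delta^\rho_{\ \mu} - (\partial_\mu \epsilon^\rho) + O(\epsilon^2). \tag{4.87} $$

Thus, 

$$
\begin{aligned}
\delta g_{\mu\nu} &= g'_{\mu\nu}(x) - g_{\mu\nu}(x) \\
&= \frac{\partial x^\rho}{\partial x'^\mu}\frac{\partial x^\sigma}{\partial x'^\nu}\, g_{\rho\sigma}(x) - (\partial_\lambda g_{\mu\nu})\,\epsilon^\lambda - g_{\mu\nu}(x) \\
&= -(\partial_\lambda g_{\mu\nu})\,\epsilon^\lambda - (\partial_\mu \epsilon^\lambda)\, g_{\lambda\nu} - (\partial_\nu \epsilon^\lambda)\, g_{\mu\lambda} \\
&= -(\partial_\lambda g_{\mu\nu})\,\epsilon^\lambda - \left\{\partial_\mu \epsilon_\nu + \partial_\nu \epsilon_\mu\right\},
\end{aligned} \tag{4.88}
$$

where one only keeps terms to first order in $\epsilon^\mu$. 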


4.6 The value of the action 


There is a frequent temptation to assign a physical meaning to the action, beyond 
its significance as a generating functional. The differential structure of the 
action, and the variational principle, give rise to canonical systems obeying 
conservation laws. This is the limit of the action’s physical significance. The 
impulse to deify the action should be stifled. 

Some field theorists have been known to use the value of the action as an 
argument for the triviality of a theory. For example, if the action has value zero, 
when evaluated on the constraint shell of the system, one might imagine that this 
is problematic. In fact, it is not. It is not the numerical value of the action but its 
differential structure which is relevant. 

The vanishing of an action on the constraint shell is a trivial property of any 
theory which is linear in the derivatives. For instance, the Dirac action and the 
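Chern–Simons [12] action have this property. For example: 

$$
\begin{aligned}
S &= \int (dx)\, \bar{\psi}\,(i\gamma^\mu \partial_\mu + m)\,\psi \\
\frac{\delta S}{\delta \bar{\psi}} &= (i\gamma^\mu \partial_\mu + m)\,\psi = 0 \\
S\big|_{\psi} &= 0.
\end{aligned} \tag{4.89}
$$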

The scalar value of the action is irrelevant, even when evaluated on some speci- 
fied constraint surface. Whether it is zero, or non-zero, it has no meaning. The 
only exception to this is in the Wick-rotated theory, where a serendipitous link 
to finite temperature physics relates the Wick-rotated action to the Hamiltonian 
or energy operator of the non-rotated theory. 


5 


Classical field dynamics 


A field is a dynamically changing potential $V(x, t)$, which evolves in time 
according to an equation of motion. The equation of motion is a constraint 
on the allowed behaviour of the field. It expresses the dynamical content of the 
theory. The solution of that constraint, called the physical field, is the pivotal 
variable from which we glean all of the physical properties of the system. In 
addition to dynamical equations, a field theory has a conceptual basis composed 
of physical assumptions, interpretations and boundary conditions. 

The familiar equations of motion, in classical field dynamics, include the 
Schrödinger equation, Maxwell’s equations, Dirac’s relativistic equation and 
several others. In the context of field theory, we call such equations classical 
as long as we are not doing quantum field theory (see chapter 15), since the 
method of solution is directly analogous to that of classical electrodynamics. 
In spite of this designation, we know that the solutions of Schrödinger’s field 
equation are wavefunctions, i.e. the stuff of quantum mechanics. Whole books 
have been written about these solutions and their interpretation, but they are not 
called field theory; they use a different name. 

Field theory embraces both quantum mechanics and classical electrodynam- 
ics, and goes on to describe the most fundamental picture of matter and energy 
known to physics. Our aim here is to seek a unified level of description for 
matter and radiation, by focusing on a field theoretical formulation. This ap- 
proach allows a uniquely valuable perspective, which forms the basis for the full 
quantum theory. The equations presented ‘classically’ in this book have many 
features in common, although they arise from very different historical threads, 
but — as we shall see in this chapter — the completeness of the field theoretical 
description of matter and radiation can only be appreciated by introducing 
further physical assumptions brought forcefully to bear by Einsteinian relativity. 
This is discussed in chapter 15. 




5.1 Solving the field equations 


A solution is a mathematical expression of the balance between the freedom 
expressed by the variables of a theory and the constraints which are implicitly 
imposed upon them by symmetries and equations of motion. 

Each physical model has a limited validity, and each has a context into 
which one builds its interpretation. Some solutions must be disregarded on the 
basis of these physical assumptions. Sometimes, additional constraints, such as 
boundary conditions, are desirable to make contact with the real world. The 
basic vocabulary of solutions involves some common themes. 


5.1.1 Free fields 


Free particles or fields do not interact. They experience no disturbances and 
continue in a fixed state of motion for ever. Free particles are generally described 
by plane wave fields or simple combinations of plane waves, which may be 
written as a Fourier transform, 


$$ \Phi(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\, e^{ikx}\, \Phi(k), \tag{5.1} $$


or, using Schwinger’s compact notation for the integration measure, as 


$$ \Phi(x) = \int (dk)\, e^{ikx}\, \Phi(k). \tag{5.2} $$


For this combination to satisfy the field equations, we must add a condition 
$\chi(k) = 0$, which picks out a hyper-surface (a sub-set) of all of the $k$, which 
actually satisfy the equations of motion: 


$$ \Phi(x) = \int (dk)\, e^{ikx}\, \Phi_\chi(k)\, \delta(\chi), \tag{5.3} $$


where $\chi = 0$ is the constraint imposed by the equations of motion on $k$. Without 
such a condition, the Fourier transform can represent an arbitrary function. 
Notice that $\Phi(k)$ and $\Phi_\chi(k)$ have different dimensions by a factor of $k$ due to the 
delta function. This condition $\chi$ is sometimes called the mass shell in particle 
physics. Elsewhere it is called a dispersion relation. Fields which satisfy this 
condition (i.e. the equations of motion) are said to be on shell, and values of $k$ 
which do not satisfy this condition are off shell. For free fields we have 


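$$
\begin{aligned}
\chi_{\rm R} &= \hbar^2\left(-\omega^2 + \mathbf{k}^2c^2\right) + m^2c^4 = 0 \\
\chi_{\rm NR} &= \frac{\hbar^2\mathbf{k}^2}{2m} - \hbar\omega = 0,
\end{aligned} \tag{5.4}
$$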


for the relativistic and non-relativistic scalar fields, respectively. The delta-function 
constraint ensures that the combinations of plane waves obey the 
field equations. It has the additional side effect that one component of the 
wavenumber $k_\mu$ is not independent and can be eliminated. It is normal to 
integrate over the zeroth (energy) component to eliminate the delta function. 
From Appendix A, eqn. (A.15), we have 


$$ \Phi(x) = \int (d\mathbf{k}) \left|\frac{\partial\chi}{\partial k^0}\right|^{-1} e^{i(\mathbf{k}\cdot\mathbf{x} - \omega(\mathbf{k})t)}\, \Phi(\mathbf{k}, \omega(\mathbf{k})). \tag{5.5} $$


Travelling waves carry momentum $k_i > 0$ or $k_i < 0$, while stationary waves 
carry no momentum, or rather both $k_i$ and $-k_i$ in equal and opposite amounts. 
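
The mass-shell constraints can be explored numerically. The sketch below is an illustration under stated assumptions: it uses natural units (hbar = c = m = 1), writes the relativistic constraint as hbar^2(-omega^2 + k^2 c^2) + m^2 c^4 = 0 and the non-relativistic one as hbar^2 k^2/(2m) - hbar omega = 0, and all function names are hypothetical. It confirms that both roots of the relativistic constraint lie on shell, and that for small k the positive root reduces to the rest energy plus the non-relativistic dispersion.

```python
import math

# Natural units: an assumption made purely for illustration.
HBAR = 1.0
C = 1.0
M = 1.0


def chi_relativistic(omega, k):
    """Relativistic mass-shell constraint; vanishes on shell."""
    return HBAR ** 2 * (-omega ** 2 + k ** 2 * C ** 2) + M ** 2 * C ** 4


def chi_nonrelativistic(omega, k):
    """Non-relativistic dispersion constraint; vanishes on shell."""
    return HBAR ** 2 * k ** 2 / (2.0 * M) - HBAR * omega


def omega_relativistic(k):
    """Positive-frequency root of the relativistic constraint."""
    return math.sqrt(k ** 2 * C ** 2 + (M * C ** 2 / HBAR) ** 2)


def omega_nonrelativistic(k):
    """Root of the non-relativistic constraint."""
    return HBAR * k ** 2 / (2.0 * M)


k = 0.01                      # small wavenumber, |k| << m c / hbar
w_r = omega_relativistic(k)
w_nr = omega_nonrelativistic(k)
rest = M * C ** 2 / HBAR      # rest-energy frequency

on_shell_plus = chi_relativistic(+w_r, k)
on_shell_minus = chi_relativistic(-w_r, k)
on_shell_nr = chi_nonrelativistic(w_nr, k)
off_shell = chi_relativistic(0.5 * w_r, k)

print("chi_R at +omega:", on_shell_plus)
print("chi_R at -omega:", on_shell_minus)
print("chi_NR on shell:", on_shell_nr)
print("chi_R off shell:", off_shell)
print("omega_R - rest  :", w_r - rest, " vs omega_NR:", w_nr)
```

The fact that both signs of omega satisfy the relativistic constraint, while the non-relativistic constraint has a single root, anticipates the discussion of positive and negative energy solutions.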


5.1.2 Boundary conditions and causality I 


A common strategy for simplifying the analysis of physical systems is to assume 
that they are infinitely large, or that they are uniform in space and/or time, or that 
they have been running uniformly in a steady state for ever. Assumptions like 
this allow one to do away with the complicated behaviour which is associated 
with the starting up or shutting down of a dynamical process. It also allows 
one to consider bulk behaviour without dealing with more difficult effects in the 
vicinity of the edges of a system. Some of the effects of finite size and starting 
up/shutting down can be dealt with by imposing boundary conditions on the 
behaviour of a system. The term boundary conditions is used with a variety of 
meanings. 


• Boundary conditions can be a specification of the absolute value of the 
field at some specific spacetime points, e.g. 

$$ \phi(x) = 0. \tag{5.6} $$

This indicates a constraint associated with some inhomogeneity in spacetime. 

• A corollary to the above is the specification of the value of the field on the 
walls of a container in a finite system. 

• At junctions or interfaces, one is interested in continuity conditions, like 
those derived in section 4.1.4 and generalizations thereof. Here, one 
matches the value of the field, perhaps up to a symmetry transformation, 
across the junction, e.g. 

$$ \Delta\phi(x_0) = 0, \tag{5.7} $$

meaning that the field does not change discontinuously across a junction. 
Conditions of this type are sometimes applied to fields, but usually it 
is more correct to apply them to conserved quantities such as invariant 
products of fields, probabilities 

$$ \Delta(\psi^\dagger\psi) = 0, \tag{5.8} $$

etc. since fields can undergo discontinuous phase changes at boundaries 
when the topology of spacetime allows or demands it. 

• Related to the last case is the issue of spatial topology. Some boundary 
conditions tell us about the connectivity of a system. For example, a field 
in a periodic lattice or circle of length $L$ could satisfy 

$$ \Phi(x + L) = U(L)\,\Phi(x). \tag{5.9} $$

In other words, the value of the field is identical, up to a possible phase or 
symmetry factor $U(L)$, on translating a distance $L$. 

• Another kind of condition which one can impose on a reversible physical 
system is a direction for causal development. The keywords here are 
advanced, retarded and Feynman boundary conditions or fluctuations. 
They have to do with a freedom to change perspective between cause and 
effect in time-reversible systems. Is the source switched on/off before 
or after a change in the field? In other words, does the source cause 
the effect or does it absorb and dampen the effect? This is a matter 
of viewpoint in reversible systems. The boundary conditions known as 
Feynman boundary conditions mix these two causal perspectives and 
provide a physical model for fluctuations of the field or ‘virtual particles’: 
a short-lived effect which is caused and then absorbed shortly afterwards. 
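
To make the topological condition concrete, the following sketch applies a condition of the form Phi(x + L) = U(L) Phi(x) to plane waves exp(ikx). It assumes, purely for illustration, that U(L) is a constant phase exp(i alpha); the function names and the numerical values are hypothetical. The condition then quantizes the wavenumber as k = (alpha + 2 pi n)/L.

```python
import cmath
import math

L = 2.0            # circumference of the circle (hypothetical value)
alpha = math.pi    # twist angle: 0 is periodic, pi is anti-periodic


def plane_wave(k, x):
    """Plane wave exp(i k x)."""
    return cmath.exp(1j * k * x)


def allowed_wavenumbers(length, twist, n_values):
    """Wavenumbers satisfying exp(i k L) = exp(i twist)."""
    return [(twist + 2.0 * math.pi * n) / length for n in n_values]


U = cmath.exp(1j * alpha)                      # symmetry factor U(L)
ks = allowed_wavenumbers(L, alpha, range(-2, 3))

x = 0.37                                       # arbitrary test point
residuals = [abs(plane_wave(k, x + L) - U * plane_wave(k, x)) for k in ks]

# A wavenumber between the allowed values violates the condition.
k_bad = ks[0] + math.pi / (2.0 * L)
bad_residual = abs(plane_wave(k_bad, x + L) - U * plane_wave(k_bad, x))

print("allowed k:", ks)
print("largest residual for allowed k:", max(residuals))
print("residual for a disallowed k   :", bad_residual)
```

Setting alpha = 0 recovers strictly periodic fields, for which k = 2 pi n / L.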


5.1.3 Positive and negative energy solutions 


The study of fields in relativistic systems leads to solutions which can be 
interpreted as having both positive and negative energy. Free relativistic field 
equations are all transcriptions of the energy relation 


$$ E = \pm\sqrt{p^2c^2 + m^2c^4}, \tag{5.10} $$


with the operator replacement $p_\mu = -i\hbar\partial_\mu$ and a field on which the operators 
act. This is most apparent in the case of the Klein–Gordon equation, 


$$ \left(-\hbar^2c^2\,\Box + m^2c^4\right)\phi(x) = 0. \tag{5.11} $$


Clearly, both signs for the energy are possible from the square-root in 
eqn. (5.10). The non-relativistic theory does not suffer from the same problem, 
since the Schrödinger equation is linear in the energy and the sign is defined to 
be positive: 


$$ \frac{p^2}{2m} = E. \tag{5.12} $$


The field $\phi(x)$ can be expanded as a linear combination of a complete set of 
plane wavefunctions satisfying the equation of motion. The field can therefore 
be written 


$$ \phi(x) = \int (dk)\, \phi(k)\, e^{ikx}\, \delta\left(\hbar^2c^2k^2 + m^2c^4\right), \tag{5.13} $$


where $\phi(k)$ are arbitrary coefficients, independent of $x$. The integral ranges over 
all energies, but one can separate the positive and negative energy solutions by 
writing 


$$ \phi(x) = \phi^{(+)}(x) + \phi^{(-)}(x), \tag{5.14} $$


where 

$$
\begin{aligned}
\phi^{(+)}(x) &= \int (dk)\, \phi(k)\, e^{ikx}\, \theta(k_0)\, \delta\left(\hbar^2c^2k^2 + m^2c^4\right) \\
\phi^{(-)}(x) &= \int (dk)\, \phi(k)\, e^{ikx}\, \theta(-k_0)\, \delta\left(\hbar^2c^2k^2 + m^2c^4\right).
\end{aligned} \tag{5.15}
$$


The symmetry of the energy relation then implies that 


$$ \phi^{(+)}(x) = \left(\phi^{(-)}(x)\right)^*. \tag{5.16} $$
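
The two branches can be made explicit by resolving the delta function. The following is a sketch under an assumed sign convention, $k^2 = -\omega^2/c^2 + \mathbf{k}^2$, using the standard identity $\delta(f(\omega)) = \sum_i \delta(\omega - \omega_i)/|f'(\omega_i)|$; the symbol $\omega_{\mathbf{k}}$ is introduced here for illustration only:

```latex
\begin{align}
\delta\left(\hbar^2c^2k^2 + m^2c^4\right)
  &= \delta\left(-\hbar^2\omega^2 + \hbar^2\omega_{\mathbf{k}}^2\right),
  \qquad
  \omega_{\mathbf{k}} = +\sqrt{c^2\mathbf{k}^2 + \frac{m^2c^4}{\hbar^2}}, \\
  &= \frac{1}{2\hbar^2\omega_{\mathbf{k}}}
     \left[\delta(\omega - \omega_{\mathbf{k}}) + \delta(\omega + \omega_{\mathbf{k}})\right].
\end{align}
```

The two terms are the two signs of the square root in the energy relation, $E = \hbar\omega = \pm\hbar\omega_{\mathbf{k}}$, and the step functions in the decomposition of the field select one term each.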


The physical interpretation of negative energy solutions is an important issue, 
not because negative energy is necessarily unphysical (energy is just a label 
which embraces a variety of conventions), but rather because there are solutions 
with arbitrarily large negative energy. A transition from any state to a state with 
energy $E = -\infty$ would produce an infinite amount of real energy for free. This 
is contrary to observations and is, presumably, nonsense. 

The positive and negative energy solutions to the free relativistic field equa- 
tions form independently complete sets, with respect to the scalar product, 


$$
\begin{aligned}
(\phi^{(+)}(x), \phi^{(+)}(x)) &= \text{const.} \\
(\phi^{(-)}(x), \phi^{(-)}(x)) &= \text{const.} \\
(\phi^{(+)}(x), \phi^{(-)}(x)) &= 0.
\end{aligned} \tag{5.17}
$$

In the search for physically meaningful solutions to the free relativistic equations, 
it might therefore be acceptable to ignore the negative energy solutions 
on the basis that they are just the mirror image of the positive energy solutions, 
describing the same physics with a different sign. 

This is the case for plane waves, or any solutions which are translationally 
invariant in time. Such a wave has a time dependence of the form 


$$ \phi(t) \sim \exp\left(-i\frac{E}{\hbar}(t - t_0)\right), \tag{5.18} $$


where $t_0$ is an arbitrary origin for time. If $E < 0$, one can simply recover a 
positive energy description by moving the origin for time $t_0$ into the far future, 
$t_0 \to \infty$, which essentially switches $t \to -t$. Since a free particle cannot 
change its energy by interaction, it will always have a definite energy, either 
positive or negative. It cannot therefore extract energy from the field by making 
a transition. 

The real problem with negative energies arises in interacting theories. It is not 
clear how to interpret these solutions from the viewpoint of classical field theory. 
An extra assumption is needed. This assumption is more clearly justified in the 
quantum theory of fields (see chapter 15), but is equally valid in the classical 
theory. The assumption is that there exists a physical state of lowest energy 
(called the vacuum state) and that states below this energy are interpreted as 
anti-matter states. 

It is sometimes stated that relativistic quantum mechanics (prior to second 
quantization) is sick, and that quantum field theory is required to make sense 
of this problem. This is not correct, and would certainly contradict modern 
thinking about effective field theories.¹ All that is required is a prescription for 
interpreting the negative energies. The assumptions of quantum field theory, 
although less well justified, are equally effective and no more arbitrary here. In 
fact, they are essential since the classical field theory is a well defined limit to 
the fully quantized field theory. 


5.1.4 Sources 


The terms source and current are often used interchangeably in field theory, 
but they refer to logically distinct entities. Sources (sometimes referred to 
emphatically as external sources) are infinitesimal perturbations to a physical 
system; currents represent a transfer between one part of a system and another. 
In an isolated (closed) system, matter and energy can flow from one place to 
another, and such currents are conserved. There is a close formal similarity 
between sources and currents, which is no accident. Sources — and their 
opposites: sinks — can be thought of as infinitesimal currents which are not 
conserved. They represent the flow of something into or out of a physical 
system, and thus a perturbation to it. Sources are also the generators of 
infinitesimal field changes, called virtual processes or fluctuations. 


¹ Certain specific Lagrangians lead to unphysical theories, but this is only a reason to reject 
certain models, not the quantum theory itself. 

In mathematics, any quantity on the ‘right hand side’ of a field equation is 
called a source, ‘forcing term’ or ‘driving term’. A source perturbs or drives the 
field linearly. For example, consider the Klein–Gordon equation 


$$ \left(-\Box + \frac{m^2c^2}{\hbar^2}\right)\phi(x) = J(x). \tag{5.19} $$


One says that $J(x)$ is a source for the field $\phi(x)$. $J$ is sometimes also referred 
to as a generalized force. Sources are included in the action in the form 


$$ S \to S + \int (dx)\, J\phi. \tag{5.20} $$

For example, the Klein–Gordon action with a source term becomes 

$$ S = \int (dx) \left\{ \frac{1}{2}\hbar^2c^2\,(\partial^\mu\phi)(\partial_\mu\phi) + \frac{1}{2}m^2c^4\phi^2 - J\phi \right\}. \tag{5.21} $$

When this action is varied, one obtains 

$$ \frac{\delta S}{\delta\phi} = \left(-\hbar^2c^2\,\Box + m^2c^4\right)\phi - J = 0, \tag{5.22} $$


which leads directly to eqn. (5.19) once the overall factor of $\hbar^2c^2$ is absorbed 
into $J$. Other source terms include 


$$ S_{\rm Maxwell} \to S_{\rm Maxwell} + \int (dx)\, J^\mu A_\mu \tag{5.23} $$


for the electromagnetic field, and 


$$ S_{\rm complex} \to S_{\rm complex} + \int (dx) \left\{ J\phi^* + J^*\phi \right\} \tag{5.24} $$


for a complex scalar field. Most interactions with the field do not have the form 
of an infinitesimal perturbation. For instance, the interaction with a Schrödinger 
field, in quantum mechanics, has the form $\psi^* V \psi$, making $J = V\psi$, which is 
not infinitesimal. However, if one assumes that $V$ is small, or infinitesimal, then 
this may be expanded around the field $\psi$ for a free theory in such a way that 
it appears to be a series of infinitesimal impulsive sources; see section 17.5. In 
this way, the source is the basic model for causal change in the field. 

Another definition of the source is by functional differentiation: 


$$ \frac{\delta S}{\delta\phi} = J_\phi, \tag{5.25} $$

where $\phi$ is a generic field. This is a generic definition and it follows directly 
from eqn. (5.20), where one does not treat the source term as part of the action 
$S$. 

A current represents a flow or transport. To define current, one looks to the 
only example of current known prior to field theory, namely the electric current. 
Recall Maxwell’s equation 


$$ \partial_\mu F^{\mu\nu} = \mu_0 J^\nu. \tag{5.26} $$


The quantity $J_\mu$ is the $(n + 1)$ dimensional current vector. It is known, 
from the microscopics of electromagnetism, that this is the electric current: 
electric currents and electric charges are responsible for the electromagnetic 
field. However, one may also say that $J_\mu$ is a source for the electromagnetic 
field, because it prevents the left hand side of this equation from being equal to 
zero. It perturbs the equation of motion. In electromagnetism the current is a 
source for the field $F_{\mu\nu}$ or $A_\mu$, so it is common to treat source and current as 
being the same thing. This tendency spills over for other fields too, and one 
often defines a generic current by eqn. (5.25). Of course, normally one imagines 
a current as being a vector, whereas the quantity in eqn. (5.25) is a scalar, but 
this may be used as a definition of ‘current’. The notion of conserved currents 
and their relation to symmetries recurs in chapter 9. 


5.1.5 Interactions and measurements 


Fields undergo interactions with other fields, and perhaps with themselves 
(self-interaction). When fields interact with other fields or potentials (either 
static or dynamical), the state of the field is modified. Classically, the field 
responds deterministically according to a well defined differential equation 
(the equation of motion), and interactions apply new constraints. One way to 
understand weakly interacting systems is to imagine them to be assemblies 
of weakly-coupled oscillators. In special circumstances, it is possible to 
construct models with interactions which can be solved exactly. Often, however, 
approximate methods are required to unravel the behaviour of interacting fields. 

In quantum mechanics the act of measurement itself is a kind of temporary 
interaction, which can lead to a discontinuous change of state. It is not funda- 
mentally different from switching on a potential in field theory. The ‘collapse of 
the wavefunction’ thus occurs as a transition resulting from an interaction with a 
measurement apparatus. This collapse has no detailed description in the theory. 


5.2 Green functions and linear response 
5.2.1 The inverse problem 


Consider an equation of the form 


$$ D\,y(t) = f(t), \tag{5.27} $$

where $D$ is a differential operator, $y(t)$ is a variable we seek to determine, and 
$f(t)$ is some forcing term, or ‘source’. We meet this kind of equation repeatedly 
in field theory, and $D$ is often an operator of the form $D = -\Box + m^2$. 

Normally, one would attempt to solve a differential equation either by 
integrating it directly, or by ‘substituting in’ a trial solution and looking for 
consistency. An alternative method is the method of Green functions. The idea 
can be approached in a number of ways. Let us first take a naive approach. 

If D is an operator, then, if a unique solution to the above equation exists, 
it must have an inverse. We can therefore write the solution to this equation 
formally (because the following step has no meaning until we have defined the 
inverse) by 


$$ y(t) = (D)^{-1} f(t) = \frac{f(t)}{D}. \tag{5.28} $$


This is much like the approach used to solve matrix equations in linear algebra. 
Both the notations in the equation above are to be found in the literature. If the 
inverse exists, then it must be defined by a relation of the form 
$$ D\,D^{-1} = \frac{D}{D} = I, \tag{5.29} $$

where $I$ is the identity operator.² We do not yet know what these quantities 
are, but if an inverse exists, then it must be defined in this way. An obvious 
thing to notice is that our eqn. (5.27) is a differential equation, so the solution 
involves some kind of integration of the right hand side. Let us now postpone the 
remainder of this train of thought for a few lines and consider another approach. 
The second way in which we can approach this problem is to think of 
eqn. (5.27) as a ‘linear response’ equation. This means that we think of the right 
hand side as being a forcing term which perturbs the solution y(t) by kicking it 
over time into a particular shape. We can decompose the force f(t) into a set of 
delta-function impulse forces over time, 


$$ f(t) = \int dt'\, \delta(t, t')\, f(t'). \tag{5.30} $$


This equation, although apparently trivial (since it defines the delta function), 
tells us that we can think of the function f(t) as being a sum of delta functions 
at different times, weighted by the values of f(t’). We can always build up a 
function by summing up delta functions at different times. In most physical 
problems we expect the value of y(t) to depend on the past history of all the 
kicks it has received from the forcing function f(t). This gives us a clue as to 
how we can define an inverse for the differential operator D. 


² Note that the ordering of the operator and inverse is an issue for differential operators. We 
require a ‘right-inverse’, but there may be no left inverse satisfying $D^{-1}D = I$. 

Suppose we introduce a bi-local function $G(t, t')$, such that 


$$ y(t) = \int dt'\, G(t, t')\, f(t'), \tag{5.31} $$


i.e. when we sum up the contributions to the force over time with this weight, 
it gives us not the force itself at a later time, but the solution. This, in fact, is 
the way we define the inverse $D^{-1}$. It has to be a bi-local function, as we shall 
see below, and it involves an integration, in spite of the purely formal notation 
in eqn. (5.29). 

Substituting this trial solution into the equation of motion, we have 


$$ D \int dt'\, G(t, t')\, f(t') = f(t), \tag{5.32} $$


where the operator $D$ acts on the variable $t$ only, since the dummy variable $t'$ is 
integrated out from minus to plus infinity. Thus, we may write 


$$ \int dt'\, D\,G(t, t')\, f(t') = f(t). \tag{5.33} $$


This equation becomes the defining equation for the delta function (5.30) if and 
only if 


$$ D\,G(t, t') = \delta(t, t'), \tag{5.34} $$


and this equation is precisely of the form of an inverse relation, where the delta 
function is the identity operator. We have therefore obtained a consistent set of 
relations which allow us to write a formal solution $y(t)$ in terms of an inverse, 
$G(t, t')$, for the operator $D$; we also have an equation which this inverse must satisfy, 
so the problem has been changed from one of finding the solution $y(t)$ to one of 
calculating the inverse function. It turns out that this is often an easier problem 
than trying to integrate eqn. (5.27) directly. 
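
The procedure can be tried on a toy problem. The sketch below is an illustration only: it assumes the first-order operator D = d/dt + gamma (not an operator used in the text), for which the retarded solution of D G = delta(t - t') is G(t, t') = theta(t - t') exp(-gamma (t - t')). It evaluates y(t) as the integral of G(t, t') f(t') over t' by the trapezoidal rule for a unit step force switched on at t = 0, and compares the result with the directly integrated solution (1 - exp(-gamma t))/gamma. All names are hypothetical.

```python
import math

GAMMA = 2.0  # damping rate of the toy operator D = d/dt + GAMMA (assumed)


def green_retarded(t, t_prime):
    """Retarded Green function of D: vanishes for t < t'."""
    if t < t_prime:
        return 0.0
    return math.exp(-GAMMA * (t - t_prime))


def force(t):
    """Unit step force switched on at t = 0."""
    return 1.0 if t >= 0.0 else 0.0


def solve_by_green(t, t_min=0.0, steps=2000):
    """y(t) = integral dt' G(t, t') f(t'), by the trapezoidal rule.

    The integral is restricted to [t_min, t]: the force vanishes
    before t_min = 0 and the retarded function vanishes for t' > t.
    """
    if t <= t_min:
        return 0.0
    h = (t - t_min) / steps
    total = 0.5 * (green_retarded(t, t_min) * force(t_min)
                   + green_retarded(t, t) * force(t))
    for n in range(1, steps):
        t_prime = t_min + n * h
        total += green_retarded(t, t_prime) * force(t_prime)
    return total * h


def solve_directly(t):
    """Solution of dy/dt + GAMMA y = 1 with y(0) = 0."""
    return (1.0 - math.exp(-GAMMA * t)) / GAMMA


t = 1.3
y_green = solve_by_green(t)
y_direct = solve_directly(t)

print("Green function result:", y_green)
print("direct integration   :", y_direct)
```

The limits of the integral were fixed here by a choice of retarded behaviour; a different choice would give a different, equally valid inverse.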

The function G (t, t') goes by several names. It is usually referred to as the 
Green(’s) function for the operator D, but it is also called the kernel for D and, 
in quantum field theory, the propagator. 

We can, of course, generalize this function for differential operators which 
act in an (n + 1) dimensional spacetime. The only difference is that we replace 
t, t' by x, x’ in the above discussion: 


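$$
\begin{aligned}
D\,y(x) &= f(x) \\
D\,G(x, x') &= c\,\delta(x, x') \\
y(x) &= \int (dx')\, G(x, x')\, f(x'),
\end{aligned} \tag{5.35}
$$

or, equivalently, 

$$
\begin{aligned}
D\,G(x, x') &= \delta(\mathbf{x}, \mathbf{x}')\,\delta(t, t') \\
y(x) &= \int (dx')\, G(x, x')\, f(x').
\end{aligned} \tag{5.36}
$$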


We are not quite finished with Green functions yet, however: we have skirted 
around an important issue above, which is described in the next section. 


5.2.2 Boundary conditions and causality II 


The discussion above is not quite complete: we have written down a function 
which relates the solution at x to a forcing term at x’ via a bi-local function 
G(x, x’). The inverse relation involves an integral over all intermediate times 
and positions x’, but over what values does this integral run? And over what 
values of x’ was the force defined? Was it switched on suddenly at some time 
in the past (giving an integral from a fixed time in the past to the present), or 
has it always existed (giving an integral from minus infinity)? Moreover, why 
should x’ be in the past? We know that physics is usually time-reversible, so 
why could we not run time backwards and relate a solution in the past to a value 
of the force in the future, or perhaps a combination of the past and future? 

All of these things are possible using different Green functions. We therefore 
see that the inverse is not unique, and it is not unique because the definition of 
the inverse involves an integration, and integrals have limits. Physically we are 
talking about the need to specify initial or boundary conditions on our physical 
system. 

The commonly used Green functions are as follows. 


• Retarded Green function $G_r(x,x')$. This relates a solution at the present
to forces strictly in the past. It is the basis of linear response theory. Due to
its origins in electromagnetism, it is often referred to as the susceptibility
$\chi(x,x') = \chi' + i\chi''$ in other books, with real and imaginary parts as
denoted.


• Advanced Green function $G_a(x,x')$. This relates a solution at the present
to forces strictly in the future.


• Feynman Green function $G_F(x,x')$. This relates a solution at the present
to forces disposed equally in the past and the future. Its interpretation
is rather subtle, since it turns real fields into complex fields as they
propagate. The Feynman Green function is a correlation function, and
a model for fluctuations in a system. It is sometimes denoted $\Delta(x,x')$,
$C(x,x')$ or $S(x,x')$ in other books.




• Wightman functions. The positive and negative frequency Wightman
functions $G^{(\pm)}(x,x')$ may be thought of as building blocks out of which
all the other Green functions may be constructed.


5.2.3 Green functions in Fourier momentum space³


A useful way of calculating quantities is to use an integral transformation, 
usually the Fourier transformation on the Green functions. The purpose of this 
step is to turn an operator equation into an ordinary algebraic equation, plus a 
single integral. This is often referred to as transforming into ‘momentum space’, 
since the choice of units makes the Fourier transform variables equivalent to 
momenta. 

We shall focus largely on the Green functions for the scalar field, since most of 
the Green functions for other fields can be obtained from this by differentiation. 
We are looking to solve an equation of the form 


$$ (-\Box + M^2)\,G(x,x') = \delta(x,x'), \tag{5.37} $$


where $M^2$ is some real mass term. We define the Fourier transforms of the Green
function by the mutually inverse relations, 


$$ G(r) = \int (dk)\, e^{ikr}\, G(k) \tag{5.38a} $$
$$ G(k) = \int (dr)\, e^{-ikr}\, G(r), \tag{5.38b} $$

where we have assumed that $G(r) = G(x,x')$ is a translationally invariant
function of the coordinates (a function only of the difference $x - x'$), which is
reasonable since $M^2$ is constant with respect to $x$. We shall also have use for the
Fourier representation of the delta function, defined in Appendix A, eqn. (A.10).
Notice how the Fourier integral is a general linear combination of plane waves 
exp(ik(x — x’)), with coefficients G(k). Using this as a solution is just like 
substituting complex exponentials into differential equations. Substituting these 
transformed quantities into eqn. (5.37), and comparing the integrands on the left 
and right hand sides, we obtain 


$$ (k^2 + M^2)\,G(k) = 1. \tag{5.39} $$


This is now an algebraic relation which may be immediately inverted and 
substituted back into eqn. (5.38b) to give 


$$ G(x,x') = \int (dk)\,\frac{e^{ik(x-x')}}{k^2 + M^2}. \tag{5.40} $$


3 In this section we set $\hbar = c = 1$ for convenience.

In addition to this ‘particular integral’, one may add to this any linear combination
of plane waves which satisfies the mass shell constraint $k^2 + M^2 = 0$. Thus
the general solution to the Green function is


$$ G_X(x,x') = \int (dk)\, e^{ik(x-x')}\left[\frac{1}{k^2+M^2} + X(k,\bar x)\,\delta(k^2+M^2)\right], \tag{5.41} $$
where $X(k,\bar x)$ is an arbitrary function of $k$, and in the unusual case of inhomogeneous
systems it can also depend on the average position $\bar x = \frac{1}{2}(x + x')$.
This arbitrariness in the complementary function is related to the issue of 
boundary conditions in the previous section and the subsequent discussion in 
the remainder of this chapter, including the choice of integration path for the 
Green function. In most cases studied here, $X(k,\bar x) = 0$, and we choose a
special solution (retarded, advanced, etc.) for the Green function. This term 
becomes important in satisfying special boundary conditions, and occurs most 
notably in statistical ‘many-particle’ systems, which vary slowly with t away 
from equilibrium. 

We are therefore left with an integral which looks calculable, and this is 
correct. However, its value is ambiguous for the reason mentioned above: 
we have not specified any boundary conditions. The ambiguity in boundary 
conditions takes on the form of a division by zero in the integrand, since 


$$ k^2 + M^2 = -k_0^2 + \mathbf{k}^2 + M^2 = (\omega_k - k_0)(\omega_k + k_0), \tag{5.42} $$
where $\omega_k = \sqrt{\mathbf{k}^2 + M^2}$. This $G(k)$ has simple poles at
$$ k_0 = \pm\omega_k. \tag{5.43} $$


In order to perform the integral, we need to define it unambiguously in the 
complex plane, by choosing a prescription for going around the poles. It 
turns out that this procedure, described in many texts, is equivalent to choosing 
boundary conditions on the Green function. 
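The Fourier step used in this section can be checked numerically in a case where no pole lies on the real axis. As a hedged sketch, we assume a one-dimensional Euclidean analogue (our choice, not the text's): the operator $-d^2/dx^2 + M^2$, whose Green function has the known closed form $e^{-M|x|}/2M$. The mass value and the grid are illustrative.

```python
import numpy as np

M = 1.3                                   # illustrative mass
k = np.linspace(-2000.0, 2000.0, 400001)  # truncated momentum grid
dk = k[1] - k[0]
G_k = 1.0 / (k**2 + M**2)                 # algebraic inverse, as in eqn. (5.39)

def G_numeric(x):
    # Inverse Fourier transform, approximated by a Riemann sum over k.
    return np.real(np.sum(np.exp(1j * k * x) * G_k) * dk / (2.0 * np.pi))

def G_exact(x):
    # Closed form for the one-dimensional Euclidean operator.
    return np.exp(-M * abs(x)) / (2.0 * M)

for x in (0.0, 0.5, 2.0):
    print(x, G_numeric(x), G_exact(x))
```

Here the integrand is regular for real $k$, so no prescription is needed; the ambiguity discussed above arises only when the sign of $k_0^2$ puts the zeros of the denominator on the integration path.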


5.2.4 Limitations of the Green function method 


The Green function method nearly always works well in field theory, but it is 
not without its limitations. The limitations have to do with the order of the 
differential operator, D, the number of spacetime dimensions and whether or 
not the operator contains a mass term. For a massive operator 


$$ (-\Box + M^2)\,\phi(x) = J(x), \tag{5.44} $$


the general solution is given by 


$$ \phi(x) = \int (dx')\, G(x,x')\, J(x'). \tag{5.45} $$

For a massless field, it is clear that one can always add to this a polynomial of
order lower than the order of the differential operator. In the example above,
setting $M = 0$ allows us to add

$$ \phi(x) = \int (dx')\, G(x,x')\, J(x') + \alpha(x - x') + \beta. \tag{5.46} $$


A more serious limitation of the Green function method arises when the order of 
the differential operator exceeds the number of spacetime dimensions involved 
in the operator. This leads to non-simple poles in the Green function, which 
presents problems for the evaluation of the Green function. For example, consider a
second-order operator in one dimension,


$$ \partial_t^2\, G(t,t') = \delta(t,t'). \tag{5.47} $$


If we try to solve this using the Fourier method, we end up with an integral of 
the form 


$$ G(t,t') = \int \frac{d\omega}{2\pi}\,\frac{e^{-i\omega(t-t')}}{-(\omega + i\epsilon)^2}. \tag{5.48} $$
This integral has a second-order pole and cannot be used to solve an equation
involving $\partial_t^2$. For example, the equation for the position of a Newtonian body


$$ \partial_t^2\, x(t) = F/m, \tag{5.49} $$


cannot be solved in this way since it is not homogeneous in the source F/m. 
The solution is easily obtained by integration 


$$ x(t) = \frac{1}{2}\frac{F}{m}\,t^2 + vt + x_0. \tag{5.50} $$


Since there are terms in this solution which are not proportional to F/m, it is 
clear that the Green function method cannot provide this full answer. However, 
the equation can still be solved by the Green function method in two stages. 
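One way to read "two stages" (an assumption on our part, since the stages are not spelled out here) is to solve two first-order problems in succession, each with the retarded Green function $\theta(t - t')$ of $\partial_t$, adding the initial velocity and position by hand as the pieces which are not proportional to the source. A minimal numerical sketch, with illustrative constants and names:

```python
import numpy as np

F_over_m, v0, x0 = 2.0, 0.7, -1.0      # illustrative constants
t = np.linspace(0.0, 3.0, 3001)
dt = t[1] - t[0]

def retarded_integral(source):
    # integral_0^t dt' theta(t - t') source(t'), by the trapezoidal rule.
    # theta(t - t') is the retarded Green function of d/dt.
    out = np.zeros_like(source)
    out[1:] = np.cumsum(0.5 * (source[1:] + source[:-1])) * dt
    return out

# Stage 1: dv/dt = F/m; the initial velocity is added separately.
v = v0 + retarded_integral(np.full_like(t, F_over_m))
# Stage 2: dx/dt = v; the initial position is added separately.
x = x0 + retarded_integral(v)

x_exact = 0.5 * F_over_m * t**2 + v0 * t + x0   # eqn. (5.50)
print(np.max(np.abs(x - x_exact)))
```

The terms $vt$ and $x_0$ enter only through the constants supplied at each stage, which is the sense in which the Green function alone does not provide the full answer.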


5.2.5 Green functions and eigenfunction methods 


In introductory quantum mechanics texts, the usual approach to solving the 
system is based on the use of the eigenfunctions of a Hamiltonian operator. 
This is equivalent to the use of Green functions. The Fourier space expressions 
given thus far assume that an appropriate expansion can be made in terms of 
plane wave eigenfunctions: 


$$ u_k(x) = e^{ikx}. \tag{5.51} $$

Written in this notation, the Green functions have the form


$$ G(x,x') = \sum_n G_n\, u_n(x)\, u_n^*(x'), \tag{5.52} $$


where the $u_n$ are a complete set of eigenfunctions, or solutions of the field
equations, and the $G_n$ are a set of constants in this new expansion. The labels $n$
are sometimes discrete (as in bound state problems) and sometimes continuous,
as in the case $n = k$, $G_n = G(k)$ and so on. In addition to the above expansion, the
question of boundary conditions must be addressed. This can be accomplished 
by multiplying the coefficients by step functions: 


$$ G_n(x,x') \sim \left(\alpha_n\,\theta(t - t') + \beta_n\,\theta(t' - t)\right). \tag{5.53} $$


This is true in many situations, at least when the system concerned is transla- 
tionally invariant. However, in bound state problems and situations of special 
symmetry, this expansion leads to an inefficient and sometimes pathological 
approach. 

Consider the relativistic scalar field as an example. The complex scalar field 
satisfies the equation 


$$ (-\Box + m^2 + V)\,\phi(x) = J(x). \tag{5.54} $$


Now let $\varphi_n$ be a complete set of eigenfunctions of the operator in this equation,
such that a general wavefunction $\phi(x)$ may be expanded in terms of a complete
set of these with coefficients $c_n$,


$$ \phi(x) = \sum_n c_n\, \varphi_n(x), \tag{5.55} $$
such that


$$ \int d\sigma_x\, (\varphi_n, \varphi_m)\Big|_{t=t'} = \delta_{nm}. \tag{5.56} $$


The wavefunction $\phi(x)$ and the eigenfunctions $\varphi_n(x)$ are assumed to be one-particle
wavefunctions. The discrete indices $n, m$ denote any bound state
quantum numbers which the wavefunction might have. The eigenfunctions
satisfy


$$ (-\Box + m^2 + V)\,\varphi_n(x) = 0. \tag{5.57} $$


The eigenfunctions can also be expressed in terms of their positive and negative 
frequency parts, 


$$ \varphi_n(x) = \varphi_n^{(+)}(x) + \varphi_n^{(-)}(x), \tag{5.58} $$
where $\varphi_n^{(+)}(x) = \left(\varphi_n^{(-)}(x)\right)^*$,
$$ \varphi_n^{(+)}(x) = \int (dk)\, e^{ikx}\,\theta(-k_0)\,\delta(k^2 + m^2 + V)\, a_n(k), \tag{5.59} $$


and $a_n(k)$ is a c-number. The Green function for the field (wavefunction) $\phi(x)$
is the inverse of the operator in eqn. (5.54), satisfying,


$$ (-\Box + m^2 + V)\,G_{nm}(x,x') = \delta_{nm}\,\delta(x,x'). \tag{5.60} $$


Using eqn. (5.57) and eqn. (A.21) from Appendix A, we can solve this equation 
with an object of the form 


$$ G_{nm} = \left(\alpha\,\theta(t - t') + \beta\,\theta(t' - t)\right)\varphi_n(x)\,\varphi_m^*(x'), \tag{5.61} $$


where $\alpha$ and $\beta$ are to be fixed by the choice of boundary conditions on the Green
function.


5.3 Scalar field Green function 


The Green function for the scalar field is defined by the relation 


$$ (-\hbar^2c^2\,\Box + m^2c^4)\,G(x,x') = \delta(\mathbf{x},\mathbf{x}')\,\delta(t,t'). \tag{5.62} $$


It is often convenient to express this in terms of the (n + 1) dimensional delta 
function 


$$ \delta(\mathbf{x},\mathbf{x}')\,\delta(t,t') = c\,\delta(\mathbf{x},\mathbf{x}')\,\delta(x^0, x^{0\prime}) \equiv c\,\delta(x,x'). \tag{5.63} $$


The right hand side of eqn. (5.62) differs from an (n + 1) dimensional delta 
function by a factor of c because the action is defined as an integral over 
$dV_t = (dx)$ rather than $dV_x$. This convention is chosen because it simplifies
the coupling between matter and radiation, and because it makes the Lagrangian
density have the dimensions of an energy density. In natural units, $\hbar = c = 1$,
this distinction does not arise. The formal expression for the scalar Green 
function on solving this equation is 


$$ G(x,x') = c\int (dk)\,\frac{e^{ik(x-x')}}{p^2c^2 + m^2c^4}, \tag{5.64} $$


where $p_\mu = \hbar k_\mu$. Thus, $G(x,x')$ has the dimensions of $\phi^2(x)$. This Green
function can be understood in a number of ways. For the remainder of this
section, we shall explore its structure in terms of the free-field solutions and the
momentum-space constraint surface $p^2c^2 + m^2c^4 = 0$, which is referred to in
the literature as the ‘mass shell’.

5.3.1 The Wightman functions


It is useful to define two quantities, known in quantum field theory as the positive 
and negative frequency Wightman functions, since all the Green functions can 
be expressed in terms of these. The Wightman functions are the solutions to the 
free differential equation,⁴


$$ (-\hbar^2c^2\,\Box + m^2c^4)\,G^{(\pm)}(x,x') = 0. \tag{5.65} $$


For convenience, it is useful to separate the solutions of this equation into 
those which have positive frequency, $k_0 = |\omega_k|$, and those which have negative
frequency, $k_0 = -|\omega_k|$. They may be written by inspection as a general linear
combination of plane waves, using a step function, $\theta(\pm k_0)$, to restrict the sign
of the frequency, and a delta function to ensure that the integral over all $k$ is
restricted only to those values which satisfy the equations of motion,


$$ \begin{aligned} G^{(+)}(x,x') &= -2\pi i\,c\int (dk)\, e^{ik(x-x')}\,\theta(-k_0)\,\delta(p^2c^2 + m^2c^4) \\ G^{(-)}(x,x') &= 2\pi i\,c\int (dk)\, e^{ik(x-x')}\,\theta(k_0)\,\delta(p^2c^2 + m^2c^4). \end{aligned} \tag{5.66} $$


Because of unitarity,⁵ these two functions are mutually conjugate (adjoint) in
the relativistic theory.


$$ G^{(+)}(x,x') = \left[G^{(-)}(x,x')\right]^* = -G^{(-)}(x',x). \tag{5.67} $$


In the non-relativistic limit, field theory splits into a separate theory for particles 
(which have positive energy) and for anti-particles (which have negative energy). 
Although this relation continues to be true, when comparing the particle theory 
with the anti-particle theory, it is not true for straightforward Schrödinger theory 
where the negative frequency Wightman function is zero at zero temperature. 

The delta function in the integrands implies that one of the components of the 
momentum is related to all the others,⁶ thus we may integrate over one of them,
$k_0$, in order to eliminate this and express it in terms of the others. The equations
of motion tell us that $ck_0 = \pm\omega_k$, where


$$ \hbar\omega_k = \sqrt{\hbar^2\mathbf{k}^2c^2 + m^2c^4}, \tag{5.68} $$


i.e. there are two solutions, so we may use the identity proven in eqn. (A.15) to 
write 


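$$ \delta(p^2c^2 + m^2c^4) = \frac{1}{2\hbar^2c^2|\omega_k|}\left[\delta\!\left(k_0 - \frac{|\omega_k|}{c}\right) + \delta\!\left(k_0 + \frac{|\omega_k|}{c}\right)\right]. \tag{5.69} $$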


4 They are analogous to the complementary function in the theory of linear partial differential 
equations. 

5 Unitarity is the property of field theories which implies conservation of energy and probabilities.

6 The momentum is said to be ‘on shell’ since the equation, $k^2 + m^2 = 0$, resembles the equation
of a spherical shell in momentum space with radius $im$.
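
This relation is valid under the integral sign for $k_0$. Noting that the step
functions, $\theta(\pm k_0)$, pick out only one or the other delta function on the right
hand side, we have


$$ \begin{aligned} G^{(+)}(x,x') &= -2\pi i\,(\hbar^2c)^{-1}\int \frac{(d\mathbf{k})}{2\pi}\,\frac{1}{2|\omega_k|}\, e^{i(\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}') - |\omega_k|(t-t'))} \\ G^{(-)}(x,x') &= 2\pi i\,(\hbar^2c)^{-1}\int \frac{(d\mathbf{k})}{2\pi}\,\frac{1}{2|\omega_k|}\, e^{i(\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}') + |\omega_k|(t-t'))} \\ &= 2\pi i\,(\hbar^2c)^{-1}\int \frac{(d\mathbf{k})}{2\pi}\,\frac{1}{2|\omega_k|}\, e^{-i(\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}') - |\omega_k|(t-t'))}. \end{aligned} \tag{5.70} $$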


Before leaving this section, we define two further symbols which appear in field 
theory, 


$$ \begin{aligned} \tilde G(x,x') &= G^{(+)}(x,x') + G^{(-)}(x,x') \\ \bar G(x,x') &= G^{(+)}(x,x') - G^{(-)}(x,x'). \end{aligned} \tag{5.71} $$


$\tilde G(x,x')$ is the sum of all solutions to the free-field equations and, in quantum
field theory, becomes the so-called commutator function.⁷ Note that
this quantity is explicitly the sum of $G^{(+)}(x,x')$ and its complex conjugate
$G^{(-)}(x,x')$ and is therefore real in the relativistic theory.⁸

The symmetric and anti-symmetric combinations satisfy the identities 


$$ \partial_t\, \bar G(x,x')\Big|_{t=t'} = 0 \tag{5.72} $$


and


$$ \partial_t\, \tilde G(x,x')\Big|_{t=t'} = \delta(\mathbf{x},\mathbf{x}'). \tag{5.73} $$


The latter turns out to be equivalent to the fundamental commutation relations
in the quantum theory of fields. $\tilde G(x,x')$ becomes the commutator function in
the quantum theory of fields.


7 This looks wrong from the definitions in terms of Green functions, but recall the signs in the 
definitions of the Green functions. The tilde denotes the fact that it is a commutator of the 
quantum fields in the quantum theory. 

8 This symmetry is broken by the non-relativistic theory as $G^{(-)}(x,x')$ vanishes at the
one-particle level.

Finally, we may note that $\omega_k$ is always positive, since it is the square-root of a
positive, real quantity, so we may drop the modulus signs in future and take this
as given.


5.3.2 Boundary conditions and poles in the $k_0$ plane


When solving differential equations in physics, the choice of boundary con- 
ditions normally determines the appropriate mixture of particular integral and 
complementary functions. The same is true for the Green function approach, but 
here the familiar procedure is occluded by the formalism of the Green function. 

The Wightman functions are the general solutions of the free-field equations: 
they are the complementary functions, which one may always add to any 
particular integral. There are two ways to add them to a special solution. One 
is to use the term X in eqn. (5.41); the other is to deform the complex contour 
around the poles. This deformation accomplishes precisely the same result as 
the addition of complementary solutions with complex coefficients. Let us now 
consider how the deformation of the complex contour leads to the choice of 
boundary conditions for the field. 

The retarded, advanced and Feynman Green functions solve the equations 
of motion in the presence of a source, with specific boundary conditions as 
mentioned in section 5.2.2. In this section, we shall impose those boundary 
conditions and show how this leads to an automatic prescription for dealing 
with the complex poles in the integrand of eqn. (5.40). The most intuitive way 
of imposing the boundary conditions is to write the Green functions in terms of 
the step function: 


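$$ G_r(x,x') = -\theta(\sigma,\sigma')\,\tilde G(x,x') \tag{5.74a} $$
$$ G_a(x,x') = \theta(\sigma',\sigma)\,\tilde G(x,x') \tag{5.74b} $$
$$ G_F(x,x') = -\theta(\sigma,\sigma')\,G^{(+)}(x,x') + \theta(\sigma',\sigma)\,G^{(-)}(x,x'). \tag{5.74c} $$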


Note that, since the retarded and advanced Green functions derive from $\tilde G(x,x')$,
they are real in $x, x'$ space (though this does not mean that their Fourier
transforms are real in $k$ space), except in the non-relativistic theory. When
we write $\theta(\sigma,\sigma')$ in this way, the $\sigma$'s usually refer to two time coordinates
$\theta(t,t')$, but in general we may be measuring the development of a system
with respect to more general spacelike hyper-surfaces, unconnected with the
Cartesian coordinate $t$ or $x^0$. For simplicity, we shall refer to $t$ and $t'$ in
future. The physical meaning of these functions is as advertised: the retarded
function propagates all data from earlier times to later times, the advanced 
function propagates all data from future times to past times, and the Feynman 
function takes positive frequency data and propagates them forwards in time, 
while propagating negative frequency data backwards in time. 
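To convert these expressions into momentum-space integrals, we make use of
the integral representations of the step function,


$$ \begin{aligned} \theta(t - t') &= i\lim_{\epsilon\to 0}\int_{-\infty}^{\infty}\frac{d\alpha}{2\pi}\,\frac{e^{-i\alpha(t-t')}}{\alpha + i\epsilon} \\ \theta(t' - t) &= -i\lim_{\epsilon\to 0}\int_{-\infty}^{\infty}\frac{d\alpha}{2\pi}\,\frac{e^{-i\alpha(t-t')}}{\alpha - i\epsilon}. \end{aligned} \tag{5.75} $$
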
Writing Ax = x — x’ for brevity, we can now evaluate these expressions using 
the momentum-space forms for the Wightman functions in eqn. (5.70). 

To evaluate the Green functions in momentum-space, it is useful to employ 
Cauchy’s residue theorem, which states that the integral around a closed
(anti-clockwise) circuit of a function equals $2\pi i$ times the sum of the residues of the
function. Suppose the function $\phi(z)$ has simple poles in the complex plane at $z_i$,
then, assuming that the closed contour is in the anti-clockwise (positive) sense,
we have


$$ \oint_C \phi(z)\,dz = 2\pi i\sum_i (z - z_i)\,\phi(z)\Big|_{z=z_i}. \tag{5.76} $$


If the contour C is in the clockwise sense, the sign is reversed. 
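The residue theorem and the sign reversal for a clockwise contour are easy to verify numerically. The following sketch uses a test function of our own choosing, $1/(z^2+1)$, which has simple poles at $z = \pm i$ with residue $1/2i$ at $z = i$; the circular contour and the helper name `contour_integral` are illustrative assumptions.

```python
import numpy as np

def contour_integral(f, centre, radius, n=2000, clockwise=False):
    # Integrate f(z) dz around a circle using the periodic trapezoidal rule.
    theta = 2.0 * np.pi * np.arange(n) / n
    if clockwise:
        theta = -theta
    z = centre + radius * np.exp(1j * theta)
    dz_dtheta = 1j * radius * np.exp(1j * theta)
    sign = -1.0 if clockwise else 1.0      # orientation of the parametrization
    return np.sum(f(z) * dz_dtheta) * sign * 2.0 * np.pi / n

def f(z):
    # Simple poles at z = +i and z = -i.
    return 1.0 / (z**2 + 1.0)

# A unit circle centred on z = i encloses only the pole at z = i,
# so the theorem predicts 2*pi*i * (1/(2i)) = pi for the anti-clockwise sense.
I_acw = contour_integral(f, 1j, 1.0)
I_cw = contour_integral(f, 1j, 1.0, clockwise=True)
print(I_acw, I_cw)
```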

The complex contour method for evaluating integrals is a useful tool for 
dealing with Green functions, but one should not confuse the contours with the 
Green functions themselves. The Green functions we seek are only defined 
on the real axis, but Cauchy’s formula only works for a closed contour with 
generally complex pieces. We can evaluate integrals over any contour, in order 
to use Cauchy’s formula, provided we can extract the value purely along the 
real axis at the end. The general strategy is to choose a contour so that the 
contributions along uninteresting parts of the curve are zero. 


5.3.3 Retarded Green function 


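Let us begin with the retarded (causal) Green function, sometimes called the
susceptibility $\chi$, and write it as an integral expression in $k$ space. We substitute
the integral expressions in eqn. (5.75) into eqn. (5.70) and eqn. (5.74a), giving


$$ \begin{aligned} G_r(x,x') &= -\frac{2\pi}{\hbar^2c}\int\frac{d\alpha}{2\pi}\,\frac{e^{-i\alpha\Delta t}}{\alpha + i\epsilon}\int\frac{(d\mathbf{k})}{2\pi}\left[\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - \omega_k\Delta t)}}{2\omega_k} - \frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} + \omega_k\Delta t)}}{2\omega_k}\right] \\ &= -\frac{1}{\hbar^2c}\int\frac{(d\mathbf{k})\,d\alpha}{2\pi}\left[\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - (\omega_k+\alpha)\Delta t)}}{2\omega_k(\alpha + i\epsilon)} - \frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - (\alpha-\omega_k)\Delta t)}}{2\omega_k(\alpha + i\epsilon)}\right]. \end{aligned} \tag{5.77} $$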
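We now shift $\alpha \to \alpha - \omega_k$ in the first term and $\alpha \to \alpha + \omega_k$ in the second term.
This gives

$$ G_r(x,x') = -(\hbar^2c)^{-1}\int\frac{d^n\mathbf{k}\,d\alpha}{(2\pi)^{n+1}}\,\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - \alpha\Delta t)}}{2\omega_k}\left[\frac{1}{\alpha - \omega_k + i\epsilon} - \frac{1}{\alpha + \omega_k + i\epsilon}\right]. \tag{5.78} $$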


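Re-labelling $\alpha \to k_0$ and combining the partial fractions on the right hand side,
we are left with,


$$ G_r(x,x') = (\hbar^2c)^{-1}\int (dk)\, e^{ik\Delta x}\,\frac{1}{-(k_0 + i\epsilon)^2 + \omega_k^2}, \tag{5.79} $$
or to first order, re-defining $\epsilon \to \epsilon/2$,
$$ G_r(x,x') = c\int (dk)\,\frac{e^{ik\Delta x}}{p^2c^2 + m^2c^4 - i p_0\epsilon}. \tag{5.80} $$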


This is the significant form we have been looking for. It may be compared 
with the expression in eqn. (5.40), and we notice that it reduces to eqn. (5.40) 
in the limit € — 0. What is important is that we now have an unambiguous 
prescription for dealing with the poles: they no longer lie on the real $k_0$ axis. If
we examine the poles of the integrand in eqn. (5.79) we see that they have been
shifted below the axis to


$$ ck_0 = \pm\omega_k - i\epsilon; \tag{5.81} $$


see figure 5.1. An alternative and completely equivalent contour is shown in 
figure 5.2. In this approach, we bend the contour rather than shift the poles; the 
end result is identical. 

This $i\epsilon$ prescription tells us how to avoid the poles on the real axis, but it
does not tell us how to complete the complex contour. Although the result we
are looking for is equal to the value of the integral along the real axis only,
Cauchy’s theorem only gives us a prescription for calculating an integral around
a closed contour, so we must complete the contour by joining the ends of the real
axis at $+\infty$ and $-\infty$ with a loop. After that, we extract the value of the portion
which lies along the real axis.

The simplest way to evaluate the contribution to such a loop is to make it a 
semi-circle either in the upper half-plane or in the lower half-plane (see figure 
5.2). But which do we choose? In fact, the choice is unimportant as long as we 
can extract the part of integral along the real axis. 


Evaluation around two closed loops  We begin by writing the integrals piecewise
around the loop in the complex $k_0$ plane. It is convenient to use $\omega = k_0c$ as



Fig. 5.1. Contour in the complex plane for the retarded Green function with poles
shifted using the $i\epsilon$ prescription.


Fig. 5.2. Contour in the complex plane for the retarded Green function. 
the integration variable, since this is what appears in the complex exponential.
The contour in figure 5.1 has the simplest shape, so we shall use this as our
template. We write eqn. (5.79) schematically: the integral over $\omega$ is written
explicitly, but we absorb all the remaining integrals and the integrand into an
object which we shall call $G'_r(k)$ to avoid clutter;


$$ \oint d\omega\, e^{-i\omega(t-t')}\, G'_r(k) = \int_{-\infty}^{+\infty} d\omega\, e^{-i\omega(t-t')}\, G'_r(k) + \int_{SC} d\omega\, e^{-i\omega(t-t')}\, G'_r(k), \tag{5.82} $$
SC 


where the first term on the right hand side is the piece we wish to find and the 
second term is the contribution from the semi-circle. 

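By Cauchy’s theorem, the value of the left hand side is equal to $2\pi i$ times the
sum of the residues of the integrand which are enclosed by the contour. Since all
of the poles lie in the lower half-plane, the left hand side is zero if we complete
in the upper half-plane. In the lower half-plane it is

$$ \oint d\omega\, e^{-i\omega(t-t')}\, G'_r(k) = -2\pi i\,(\hbar^2c)^{-1}\int\frac{d^n\mathbf{k}}{(2\pi)^{n+1}}\left[\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} + \omega_k\Delta t)}}{2\omega_k} - \frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - \omega_k\Delta t)}}{2\omega_k}\right]. \tag{5.83} $$

Re-labelling $\mathbf{k} \to -\mathbf{k}$ in the first term and using
$$ e^{ix} - e^{-ix} = 2i\sin(x), \tag{5.84} $$
we have ($\Delta t > 0$)
$$ \oint d\omega\, e^{-i\omega(t-t')}\, G'_r(k) = -(\hbar^2c)^{-1}\int\frac{d^n\mathbf{k}}{(2\pi)^n}\,\frac{\sin(\mathbf{k}\cdot\Delta\mathbf{x} - \omega_k\Delta t)}{\omega_k}. \tag{5.85} $$


This is clearly real.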


Semi-circle in the upper half-plane The integral around the semi-circle in the 
upper half-plane can be parametrized using polar coordinates. We let 


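$$ \omega = re^{i\theta} = r(\cos\theta + i\sin\theta), \tag{5.86} $$


so that,

$$ \begin{aligned} \int_{SC} d\omega\, e^{-i\omega(t-t')}\, G'_r(k) &= \int_0^\pi i r e^{i\theta}\, d\theta\; e^{-ir(\cos\theta + i\sin\theta)(t-t')}\, G'_r(re^{i\theta}) \\ &= \int_0^\pi i r e^{i\theta}\, d\theta\; e^{-ir\cos\theta\,(t-t')}\, e^{r\sin\theta\,(t-t')}\, G'_r(re^{i\theta}). \end{aligned} \tag{5.87} $$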
Note what has happened here. The imaginary component from the semi-circle
(the contribution involving $\sin\theta\,(t - t')$) has created a real exponential. This real
exponential causes the integrand to either blow up or decay to zero at $r = \infty$,
depending on the sign of the $\sin\theta\,(t - t')$ term. So we have two cases:


$$ \int_{SC} d\omega\, e^{-i\omega(t-t')}\, G'_r(k) = \begin{cases} 0 & (t - t' < 0) \\ ? & (t - t' > 0). \end{cases} \tag{5.88} $$


In the first case, in which we do not expect the retarded function to be defined, 
the integral over the semi-circle vanishes. Since the complete integral around 
the loop also vanishes here, the real axis contribution that we are looking for 
(looking at eqn. (5.82)), must also be zero. In the second case, the contribution
from the loop is difficult to determine, so the contribution to the real axis part, 
from eqn. (5.82) is also difficult to determine. In fact, we cannot derive any 
useful information from this, so for t — t’ > 0, we cannot determine the value of 
the integral. In order to evaluate the integral for t — t’ > 0 we close the contour 
in the lower half-plane where the semi-circle contribution is again well behaved. 


Semi-circle in the lower half-plane The integral around the semi-circle in the 
lower half-plane can also be parametrized using polar coordinates, 


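$$ \begin{aligned} \int_{SC} d\omega\, e^{-i\omega(t-t')}\, G'_r(k) &= -\int_\pi^{2\pi} i r e^{i\theta}\, d\theta\; e^{-ir(\cos\theta + i\sin\theta)(t-t')}\, G'_r(re^{i\theta}) \\ &= -\int_\pi^{2\pi} i r e^{i\theta}\, d\theta\; e^{-ir\cos\theta\,(t-t')}\, e^{-r|\sin\theta|(t-t')}\, G'_r(re^{i\theta}), \end{aligned} \tag{5.89} $$

where $\theta$ now runs over the lower half-plane, in which $\sin\theta \le 0$.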


Now the opposite happens: 


$$ \int_{SC} d\omega\, e^{-i\omega(t-t')}\, G'_r(k) = \begin{cases} ? & (t - t' < 0) \\ 0 & (t - t' > 0). \end{cases} \tag{5.90} $$


This time the situation is reversed. The value of the integral tells us nothing for
$t - t' < 0$. In the second case, however, the contribution to the loop goes to
zero, making the integral along the real axis equal to the loop integral result in
eqn. (5.85).


Piece-wise definition  Because of the infinite pieces, we must close the contour
for the retarded Green function separately for $t - t' > 0$ (lower half-plane,
non-zero result) and $t - t' < 0$ (upper half-plane, zero result). This is not a
serious problem for evaluating single Green functions, but the correct choice
of contour becomes more subtle when calculating products of Green functions
using the momentum-space forms. We have nonetheless established that these
momentum-space prescriptions lead to a Green function which propagates from
the past into the future:

$$ G_r(x,x') = \begin{cases} (\hbar^2c)^{-1}\displaystyle\int\frac{d^n\mathbf{k}}{(2\pi)^n}\,\cos(\mathbf{k}\cdot\Delta\mathbf{x})\,\frac{\sin(\omega_k\Delta t)}{\omega_k} & (t - t' > 0) \\ 0 & (t - t' < 0). \end{cases} \tag{5.91} $$
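The causal behaviour can be seen numerically for a single mode. As a sketch under our own simplifying assumptions, we drop the spatial integral and the factor $(\hbar^2c)^{-1}$, set $c = 1$, and keep $\epsilon$ small but finite, so that the frequency integral of eqn. (5.79) should reproduce $e^{-\epsilon t}\sin(\omega_k t)/\omega_k$ for $t > 0$ and vanish for $t < 0$. The numerical values and names are illustrative.

```python
import numpy as np

omega_k, eps = 2.0, 0.05                   # illustrative; eps is kept finite
w = np.linspace(-2000.0, 2000.0, 800001)   # frequency grid, spacing 0.005
dw = w[1] - w[0]

def G_r_mode(t):
    # Single-mode frequency integral with both poles at w = +/- omega_k - i*eps.
    integrand = np.exp(-1j * w * t) / (-(w + 1j * eps)**2 + omega_k**2)
    return np.sum(integrand) * dw / (2.0 * np.pi)

def residue_result(t):
    # Value obtained by closing the contour as described in the text.
    if t < 0:
        return 0.0
    return np.exp(-eps * t) * np.sin(omega_k * t) / omega_k

for t in (-1.0, 1.0, 2.5):
    print(t, G_r_mode(t), residue_result(t))
```

The small imaginary part returned by the numerical integral is a discretization artefact; in the limit $\epsilon \to 0$ the damping factor disappears and the response is purely real.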


5.3.4 Advanced Green function 


The treatment of this function is identical in structure to that for the retarded 
propagator. The only difference is that the poles lie in the opposite half-plane, 
and thus the results are reversed: 


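$$ G_a(x,x') = (\hbar^2c)^{-1}\int (dk)\, e^{ik\Delta x}\,\frac{1}{-(k_0 - i\epsilon)^2 + \omega_k^2}. \tag{5.92} $$


We see that the poles are shifted above the axis and that the complex contour
may now be completed in the opposite manner to the retarded Green function.
The result is

$$ G_a(x,x') = \begin{cases} (\hbar^2c)^{-1}\displaystyle\int\frac{d^n\mathbf{k}}{(2\pi)^n}\,\frac{\sin(\mathbf{k}\cdot\Delta\mathbf{x} - \omega_k\Delta t)}{\omega_k} & (t - t' < 0) \\ 0 & (t - t' > 0). \end{cases} \tag{5.93} $$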


5.3.5 Feynman Green function 


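Substituting the integral representations in eqn. (5.75) into eqn. (5.70) and eqn. (5.74c) gives

$$ G_F(x,x') = -(\hbar^2c)^{-1}\int\frac{d\alpha}{2\pi}\,(d\mathbf{k})\left[\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - (\omega_k+\alpha)\Delta t)}}{(\alpha + i\epsilon)\,2\omega_k} - \frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - (\alpha-\omega_k)\Delta t)}}{(\alpha - i\epsilon)\,2\omega_k}\right]. \tag{5.94} $$


Shifting $\alpha \to \alpha - \omega_k$ in the first fraction and $\alpha \to \alpha + \omega_k$ in the second fraction,
and re-labelling $\alpha \to k_0$ we obtain,


$$ G_F(x,x') = (\hbar^2c)^{-1}\int (dk)\,\frac{e^{ik\Delta x}}{2\omega_k}\left[\frac{1}{k_0 + \omega_k - i\epsilon} - \frac{1}{k_0 - \omega_k + i\epsilon}\right]. \tag{5.95} $$


It is normal to re-write this in the following way. Remember that we are
interested in the limit $\epsilon \to 0$. Combining the partial fractions above, we get


$$ G_F(x,x') = (\hbar^2c)^{-1}\int (dk)\, e^{ik\Delta x}\left[\frac{-1}{(k_0 + \omega_k - i\epsilon)(k_0 - \omega_k + i\epsilon)} + O(\epsilon)\right]. \tag{5.96} $$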
Fig. 5.3. Contour in the complex plane for the Feynman Green function. This shows
how the $i\epsilon$ prescription moves the poles effectively from the real axis.


From this expression, we see that the poles have been shifted from the real axis 
to 


$$ \begin{aligned} ck_0 &= \omega_k - i\epsilon \\ ck_0 &= -\omega_k + i\epsilon, \end{aligned} \tag{5.97} $$


i.e. the negative root is shifted above the axis and the positive root below the axis
in the $k_0$ plane (see figure 5.4). An equivalent contour is shown in figure 5.3.
Although it does not improve one’s understanding in any way, it is normal in the 
literature to write the Feynman Green function in the following way. Re-writing 
the denominator, we have 


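$$ (ck_0 + \omega_k - i\epsilon)(ck_0 - \omega_k + i\epsilon) = c^2k_0^2 - \omega_k^2 + 2i\epsilon\,\omega_k + \epsilon^2. \tag{5.98} $$


Now, since $\epsilon$ is infinitesimal and $\omega_k > 0$, we may drop $\epsilon^2$, and write $2i\epsilon\,\omega_k = i\epsilon'$.
This allows us to write


$$ G_F(x,x') = c\int (dk)\,\frac{e^{ik\Delta x}}{p^2c^2 + m^2c^4 - i\epsilon'}. \tag{5.99} $$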
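The effect of the Feynman pole positions can be illustrated for a single mode, in the same spirit as a numerical check of any contour prescription. This is a sketch under our own assumptions: the spatial integral and the factor $(\hbar^2c)^{-1}$ are dropped, $c = 1$, and $\epsilon$ is kept small but finite. Closing the contour below for $t > 0$ and above for $t < 0$ then gives $i e^{-ia|t|}/2a$ with $a = \omega_k - i\epsilon$.

```python
import numpy as np

omega_k, eps = 2.0, 0.05                   # illustrative; eps is kept finite
w = np.linspace(-2000.0, 2000.0, 800001)   # frequency grid, spacing 0.005
dw = w[1] - w[0]

def G_F_mode(t):
    # Single-mode integrand with the pole positions of eqn. (5.97).
    denom = (w + omega_k - 1j * eps) * (w - omega_k + 1j * eps)
    return np.sum(-np.exp(-1j * w * t) / denom) * dw / (2.0 * np.pi)

def residue_result(t):
    # Closed form from the residue theorem, with a = omega_k - i*eps.
    a = omega_k - 1j * eps
    return 1j * np.exp(-1j * a * abs(t)) / (2.0 * a)

for t in (-1.5, 1.5):
    print(t, G_F_mode(t), residue_result(t))
```

The result is complex and depends on $|t|$ only: the positive frequency factor $e^{-i\omega_k t}$ appears for $t > 0$ and the negative frequency factor $e^{+i\omega_k t}$ for $t < 0$.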
Fig. 5.4. Contour in the complex plane for the Feynman Green function. Here we
bend the contour rather than moving the poles. The result is identical.


5.3.6 Comment on complex contours 


The procedure described by Green functions is a formalism for extracting 
solutions to the inverse-operator problem. It has a direct analogy in the theory of 
matrices or linear algebra. There the issue concerns the invertibility of matrices 
and the determinant of the matrix operator. Suppose we have a matrix equation 


$$ M\cdot x = J, \tag{5.100} $$


with a matrix $M$ given by


$$ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \tag{5.101} $$


If this matrix has an inverse, which is true if the determinant $ad - bc$ does not
vanish,


$$ M^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \tag{5.102} $$
then eqn. (5.100) has a unique solution. We would not expect this case to
correspond to the solution of a differential equation such as the one we are
considering, since we know that the general solution to second-order differential
equations usually involves a linear super-position of many solutions.

If the determinant of $M$ does vanish, then the solution, if one exists, is not unique:
there is an infinite number of solutions, which corresponds to a sub-space of $x$
(a hyper-surface which is determined by a constraint linking the coordinates). In this case, the
inverse defined above in eqn. (5.102) has a pole. For example, suppose we take
$M$ to be the matrix


$$ M = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 1 & 0 \\ 4 & 8 & 4 \end{pmatrix}, \tag{5.103} $$
and
$$ J = \begin{pmatrix} 4 \\ 2 \\ 16 \end{pmatrix}. \tag{5.104} $$


This matrix clearly has no inverse, since the third row is a multiple of the first. 
The determinant vanishes, but in this trivial case we can solve the equations 
directly. Since there are only two independent equations and three unknowns, it 
is not possible to find a unique solution. Instead, we eliminate all but one of the 
variables, leaving 


$$ x_2 + x_3 = 2. \tag{5.105} $$


This is the equation of a straight line, or a sub-space of the full three-dimensional 
solution space. We regard this as an incomplete constraint on the solution space 
rather than a complete solution. 
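The example above can be checked directly. The following sketch uses standard NumPy routines; the choice of a minimum-norm particular solution and the sample values of the free parameter `s` are our own illustrative assumptions.

```python
import numpy as np

M = np.array([[1.0, 2.0, 1.0],
              [1.0, 1.0, 0.0],
              [4.0, 8.0, 4.0]])
J = np.array([4.0, 2.0, 16.0])

det_M = np.linalg.det(M)            # vanishes up to rounding
rank_M = np.linalg.matrix_rank(M)   # number of independent equations
print(det_M, rank_M)

# A particular solution (minimum norm) plus the null direction of M
# generates the whole family of solutions.
x_p, *_ = np.linalg.lstsq(M, J, rcond=None)
_, _, Vt = np.linalg.svd(M)
null = Vt[-1]                       # direction annihilated by M

solutions = [x_p + s * null for s in (-1.0, 0.0, 2.5)]
for x in solutions:
    print(x, M @ x, x[1] + x[2])    # each x solves M x = J, with x2 + x3 = 2
```

The free parameter `s` plays the role of the complementary function: it labels the solutions which the singular operator cannot distinguish.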


This is analogous to the situation we have with the Green functions. The poles 
indicate that the solution to the differential equation which we are trying to solve 
is not unique. In fact, there is an infinite number of plane wave solutions which 
lie on the hyper-surface $k^2 + m^2 = 0$, called the mass shell.


5.4 Scalar Green functions in real space 


Although the momentum-space representations of the Green functions are useful 
for calculations, we are usually interested in their forms in real space. For 
general fields with a mass, these can be quite complicated, but in the massless 
limit the momentum-space integrals can be straightforwardly evaluated. 


Again, since the other relativistic Green functions can be expressed in terms 
of that for the scalar field, we shall focus mainly on this simple case. 

5.4.1 The retarded Green function for $n = 3$ as $m \to 0$


From Cauchy’s residue theorem in eqn. (5.76), we have


$$ G_r(x,x') = -2\pi i\,(\hbar^2c)^{-1}\int\frac{d^3\mathbf{k}}{(2\pi)^4}\left[\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} + \omega_k\Delta t)}}{2\omega_k} - \frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - \omega_k\Delta t)}}{2\omega_k}\right]. \tag{5.106} $$


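For general $m \neq 0$, this integral defines Bessel functions. For $m = 0$,
however, the integral is straightforward and can be evaluated by going to
three-dimensional polar coordinates in momentum space:


$$ \begin{aligned} \omega_k &= |r|c \\ \int d^3\mathbf{k} &= \int_0^\infty r^2\,dr\int_0^\pi \sin\theta\,d\theta\int_0^{2\pi} d\phi \\ \mathbf{k}\cdot\Delta\mathbf{x} &= |r|\,\Delta X\cos\theta, \end{aligned} \tag{5.107} $$
where $\Delta X = |\Delta\mathbf{x}|$, so that
$$ G_r(x,x') = \frac{-i}{16\pi^3}\,(\hbar^2c)^{-1}\int_0^\infty 2\pi r^2\,dr\int_0^\pi \sin\theta\,d\theta\;\frac{e^{ir\Delta X\cos\theta}}{r}\left[e^{icr\Delta t} - e^{-icr\Delta t}\right]. \tag{5.108} $$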

The integral over dO may now be performed, giving 


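$$ G_r(x,x') = -\frac{1}{8\pi^2\,\Delta X}\,(\hbar^2c)^{-1}\int_0^\infty dr\left\{e^{-ir(\Delta X + c\Delta t)} - e^{-ir(\Delta X - c\Delta t)} - e^{ir(\Delta X - c\Delta t)} + e^{ir(\Delta X + c\Delta t)}\right\}. \tag{5.109} $$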
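The angular integration used to reach eqn. (5.109) is elementary. As a sketch of the intermediate step, with the substitution $u = \cos\theta$ (our notation, not the text's):

```latex
\begin{aligned}
\int_0^\pi \sin\theta\, d\theta\; e^{ir\Delta X\cos\theta}
  &= \int_{-1}^{1} du\; e^{ir\Delta X u}
   = \frac{e^{ir\Delta X} - e^{-ir\Delta X}}{ir\,\Delta X}, \\
\left(e^{ir\Delta X} - e^{-ir\Delta X}\right)\left(e^{icr\Delta t} - e^{-icr\Delta t}\right)
  &= e^{ir(\Delta X + c\Delta t)} - e^{ir(\Delta X - c\Delta t)}
   - e^{-ir(\Delta X - c\Delta t)} + e^{-ir(\Delta X + c\Delta t)}.
\end{aligned}
```

The factor $1/(ir\,\Delta X)$ combines with the $2\pi r^2/r$ of the radial measure, which is why only a single integral over $r$ remains.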


Note that both $\Delta t$ and $\Delta X$ are positive by assumption. From the
definition of the delta function, we have

\[
2\pi\delta(x) = \int_{-\infty}^{+\infty} dk\, e^{ikx}
= \int_0^{\infty} dk\left[e^{ikx}+e^{-ikx}\right].
\tag{5.110}
\]


Using this result, we see that the first and last terms in eqn. (5.109) vanish, since
$\Delta X$ can never be equal to $-c\Delta t$ as both $\Delta X$ and $\Delta t$ are
positive. This leaves us with

\[
G_r(x,x') = \frac{1}{4\pi\hbar^2 c\,\Delta X}\,\delta(c\Delta t-\Delta X)
= \frac{1}{4\pi\hbar^2 c\,|\mathbf{x}-\mathbf{x}'|}\,
\delta\big(c(t-t')-|\mathbf{x}-\mathbf{x}'|\big).
\tag{5.111}
\]
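The only non-trivial step between eqns. (5.108) and (5.109) is the integral over the polar angle. The following is a minimal numerical sketch of that single step, checking that the angular integral of $\sin\theta\, e^{ia\cos\theta}$ equals $(e^{ia}-e^{-ia})/(ia) = 2\sin(a)/a$. The symbol `a` stands for the product $r\,\Delta X$; the function names and the test value are illustrative assumptions, not part of the text.

```python
import cmath
import math


def angular_integral(a, n=20000):
    """Midpoint-rule estimate of the integral of sin(theta) * exp(i a cos(theta)) over [0, pi]."""
    h = math.pi / n
    total = 0j
    for j in range(n):
        theta = (j + 0.5) * h
        total += math.sin(theta) * cmath.exp(1j * a * math.cos(theta))
    return total * h


def closed_form(a):
    """(e^{ia} - e^{-ia}) / (i a), which simplifies to 2 sin(a) / a."""
    return 2.0 * math.sin(a) / a


a = 3.7  # plays the role of r * Delta X; the value is arbitrary
numeric = angular_integral(a)
print(numeric.real, numeric.imag, closed_form(a))
```

The imaginary part of the numerical result should be negligible, which reflects the symmetry of the integrand about $\theta = \pi/2$.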


5.4.2 The $G^{(\pm)}$ and $G_F$ for n = 3 as m → 0


From eqn. (5.74c) we can see that the Feynman propagator is manifestly equal
to $-G^{(+)}$ for $t > t'$ and is equal to $G^{(-)}$ for $t < t'$. The calculation of all three
quantities can therefore be taken together. We could, in fact, write this down 
from the definitions, but it is useful to use the residue theorem on eqn. (5.95) 
to show the consistency of the procedure with the definitions we have already 
given. In fact, we shall see that the Wightman functions are just the residues, up 
to a sign which depends on the orientation of the closed contour. 

For t — t’ > 0, we complete the contour in the lower half-plane, creating an 
anti-clockwise contour. The residue theorem then tells us that 


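\[
\oint dk_0\, G_F(k_0) = (\hbar^2c)^{-1}\int\frac{(d\mathbf{k})}{(2\pi)}\,
\frac{1}{2\omega_k}\,e^{i(\mathbf{k}\cdot\Delta\mathbf{x}-\omega_k\Delta t)}
\times -2\pi i\,\{-1\}\,\Big|_{k_0=\omega_k}.
\tag{5.112}
\]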


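Comparing this equation with eqns. (5.66), we see that this is precisely equal to
$-G^{(+)}(x, x')$. In the massless limit with n = 3, we may therefore write

\[
G^{(+)}(x,x') = -i\,(\hbar^2c)^{-1}\int\frac{d^3\mathbf{k}}{(2\pi)^3}\,
\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x}-|\mathbf{k}|c\Delta t)}}{2|\mathbf{k}|c}
= \frac{1}{8\pi^2\hbar^2c\,|\mathbf{x}'-\mathbf{x}|}\int_0^\infty dr
\left\{e^{-ir(\Delta X+c\Delta t)} - e^{ir(\Delta X-c\Delta t)}\right\}.
\tag{5.113}
\]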


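Similarly, for $t - t' < 0$, we complete the contour in the upper half-plane,
creating a clockwise contour. This gives

\[
\oint dk_0\, G_F(k_0) = (\hbar^2c)^{-1}\int\frac{(d\mathbf{k})}{(2\pi)}\,
\frac{1}{2\omega_k}\,e^{i(\mathbf{k}\cdot\Delta\mathbf{x}+\omega_k\Delta t)}
\times 2\pi i\,\{+1\}\,\Big|_{k_0=-\omega_k}.
\tag{5.114}
\]

Comparing this equation with eqn. (5.66), we see that this is precisely equal to
$G^{(-)}(x, x')$, and

\[
G^{(-)}(x,x') = -\frac{1}{8\pi^2\hbar^2c\,|\mathbf{x}'-\mathbf{x}|}\int_0^\infty dr
\left\{e^{-ir(\Delta X-c\Delta t)} - e^{ir(\Delta X+c\Delta t)}\right\}.
\tag{5.115}
\]
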
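It may be checked that these expressions satisfy eqn. (5.67). Finally, we may
piece together the Feynman Green function from $G^{(\pm)}$. Given that the $\Delta t$ are
assumed positive, we have

\[
G_F(x,x') = \frac{1}{8\pi^2\hbar^2c\,|\mathbf{x}-\mathbf{x}'|}\int_0^\infty dr
\left(e^{-ir\Delta X} - e^{ir\Delta X}\right)e^{-irc\Delta t}
= \frac{-i}{4\pi^2\hbar^2c\,|\mathbf{x}-\mathbf{x}'|}\int_0^\infty dr\,
\sin\big(r|\mathbf{x}'-\mathbf{x}|\big)\,e^{-irc|t-t'|}.
\tag{5.116}
\]
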
We may note that the difference between the retarded and Feynman Green
functions is

\[
G_F(x,x') - G_r(x,x') = \lim_{\alpha\to 0}\, 2\int_0^\infty dr
\left[e^{ir(c\Delta t-\Delta X+i\alpha)} - e^{ir(c\Delta t+\Delta X+i\alpha)}\right]
= \frac{2i}{c\Delta t-\Delta X} - \frac{2i}{c\Delta t+\Delta X},
\tag{5.117}
\]


where a is introduced to define the infinite limit of the complex exponential. 
This difference is a purely imaginary number, which diverges on the light cone. 


5.4.3 Frequency-dependent form of $G_F$ and $G_r$ in n = 3


In atomic physics and optics, one usually deals with implicitly translationally
invariant systems, in the rest frame of an atom, where the frequency $\omega$ and time
are the only variables entering physical models. To use standard field theoretical
methods in these cases, it is useful to have the Green functions in such a form,
by integrating over spatial wavenumbers leaving only the Fourier transform over
time. These are obtained trivially by re-writing the non-zero contributions to
eqns. (5.109) and (5.116) with $r \to \omega/c$:

\[
\begin{aligned}
G_F(x,x') &= \frac{-i}{4\pi^2\hbar^2c^2\,|\mathbf{x}-\mathbf{x}'|}\int_0^\infty d\omega\,
\sin\left(\frac{\omega}{c}|\mathbf{x}-\mathbf{x}'|\right)e^{-i\omega|t-t'|}, \\
G_r(x,x') &= \frac{1}{4\pi^2\hbar^2c^2\,|\mathbf{x}-\mathbf{x}'|}\int_0^\infty d\omega\,
\cos\left(\frac{\omega}{c}|\mathbf{x}-\mathbf{x}'| - \omega|t-t'|\right).
\end{aligned}
\tag{5.118}
\]


5.4.4 Euclidean Green function in 2 + 0 dimensions 


In the special case of a space-only Green function (the inverse of the Laplacian 
operator), there is no ambiguity in the boundary conditions, since the Green 
function is time-independent and there are no poles in the integrand. Let us 
define the inverse Laplacian by 


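\[ (-\nabla^2 + m^2)\, g(\mathbf{x},\mathbf{x}') = \delta(\mathbf{x},\mathbf{x}'). \tag{5.119} \]

To evaluate this function, we work in Fourier space and write

\[
g(\mathbf{x},\mathbf{x}') = \int\frac{d^2\mathbf{k}}{(2\pi)^2}\,
\frac{e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}}{\mathbf{k}^2+m^2},
\tag{5.120}
\]

where $\mathbf{k}^2 = k_1^2 + k_2^2$. Expressing this in polar coordinates, we have

\[
g(\mathbf{x},\mathbf{x}') = \int_0^{2\pi}\!\!\int_0^\infty \frac{r\,dr\,d\theta}{(2\pi)^2}\,
\frac{e^{ir|\mathbf{x}-\mathbf{x}'|\cos\theta}}{r^2+m^2}.
\tag{5.121}
\]
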
Massless case. In the massless limit, this integral can be evaluated
straightforwardly using a trick which is frequently useful in the evaluation of Fourier
integrals. The trick is strictly valid only when $\mathbf{x} \neq \mathbf{x}'$; we leave it as
an exercise to show what happens when $\mathbf{x} = \mathbf{x}'$. The integral is evaluated by
setting m to zero in eqn. (5.121) and cancelling a factor of r from the integration
measure. To evaluate the expression, we differentiate under the integral sign
with respect to the quantity $|\mathbf{x} - \mathbf{x}'|$:

\[
\frac{d}{d|\mathbf{x}-\mathbf{x}'|}\,g(\mathbf{x}-\mathbf{x}') =
\int_0^{2\pi}\!\!\int_0^\infty \frac{i\cos\theta}{(2\pi)^2}\,
e^{ir|\mathbf{x}-\mathbf{x}'|\cos\theta}\,dr\,d\theta.
\tag{5.122}
\]


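Notice that this cancels a factor of r in the denominator, which means that
the integral over r is now much simpler. Formally, we have

\[
\frac{d}{d|\mathbf{x}-\mathbf{x}'|}\,g(\mathbf{x}-\mathbf{x}') =
\int_0^{2\pi}\frac{d\theta}{(2\pi)^2}
\left[\frac{e^{ir|\mathbf{x}-\mathbf{x}'|\cos\theta}}{|\mathbf{x}-\mathbf{x}'|}\right]_0^\infty.
\tag{5.123}
\]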


There is still a subtlety remaining, however: since we are integrating a complex,
oscillatory function, its value at the infinite limit is ambiguous. The limit can
be defined uniquely (analytically continued) by adding an infinitesimal positive
imaginary part to r, so that $r \to r(1 + i\epsilon)$, and letting $\epsilon \to 0$ afterwards. This
makes the infinite limit converge to zero, leaving only a contribution from the
lower limit:

\[
\frac{d}{d|\mathbf{x}-\mathbf{x}'|}\,g(\mathbf{x}-\mathbf{x}') =
\lim_{\epsilon\to 0}\int_0^{2\pi}\frac{d\theta}{(2\pi)^2}
\left[\frac{e^{(ir-\epsilon r)|\mathbf{x}-\mathbf{x}'|\cos\theta}}{(1+i\epsilon)\,|\mathbf{x}-\mathbf{x}'|}\right]_0^\infty
= -\int_0^{2\pi}\frac{d\theta}{(2\pi)^2}\,\frac{1}{|\mathbf{x}-\mathbf{x}'|}.
\tag{5.124}
\]


To complete the evaluation, we evaluate the two remaining integrals trivially, 
first the anti-derivative with respect to |x — x’|, which gives rise to a logarithm, 
and finally the integral over 6, giving: 


\[ g(\mathbf{x},\mathbf{x}') = -\frac{1}{2\pi}\ln|\mathbf{x}-\mathbf{x}'|, \tag{5.125} \]

where it is understood that $\mathbf{x} \neq \mathbf{x}'$.
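As a hedged numerical cross-check of eqn. (5.125), the sketch below verifies two properties that an inverse of $-\nabla^2$ in two dimensions should have: the function is harmonic away from the origin, and the outward flux of $-\nabla g$ through any circle around the origin is unity (the integrated delta function). The function names, step sizes and sample points are illustrative assumptions.

```python
import math


def g(x, y):
    """Massless two-dimensional Green function -(1/2 pi) ln|x| with the source at the origin."""
    return -math.log(math.hypot(x, y)) / (2.0 * math.pi)


def laplacian(x, y, h=1e-3):
    """Five-point finite-difference Laplacian of g at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4.0 * g(x, y)) / h ** 2


def outward_flux(radius, n=2000, h=1e-5):
    """Flux of -grad g through a circle of the given radius centred on the source."""
    total = 0.0
    for j in range(n):
        phi = 2.0 * math.pi * (j + 0.5) / n
        c, s = math.cos(phi), math.sin(phi)
        # Radial derivative of g by central difference.
        dg_dr = (g((radius + h) * c, (radius + h) * s) - g((radius - h) * c, (radius - h) * s)) / (2.0 * h)
        total += -dg_dr * radius * (2.0 * math.pi / n)
    return total


print(laplacian(0.7, -0.4))   # close to zero away from the source
print(outward_flux(0.5))      # close to one
print(outward_flux(3.0))      # close to one, independent of the radius
```

The radius independence of the flux is the statement that all of the source strength sits at the single excluded point $\mathbf{x} = \mathbf{x}'$.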


5.4.5 Massive case 


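In the massive case, we can write down the result in terms of Bessel functions
$J_\nu$, $K_\nu$, by noting the following integral identities [63]:

\[
J_\nu(z) = \frac{(z/2)^\nu}{\Gamma(\nu+\tfrac{1}{2})\,\Gamma(\tfrac{1}{2})}
\int_0^\pi e^{\pm iz\cos\theta}\,\sin^{2\nu}\theta\;d\theta
\tag{5.126}
\]

\[
K_{\nu-\mu}(ab) = \frac{2^\mu\,\Gamma(\mu+1)}{a^{\nu-\mu}\,b^\mu}
\int_0^\infty \frac{J_\nu(bx)\,x^{\nu+1}}{(x^2+a^2)^{\mu+1}}\,dx.
\tag{5.127}
\]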
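From the first of these, we can choose $\nu = 0$ and use the symmetry of the cosine
function to write

\[ J_0(z) = \frac{1}{2\pi}\int_0^{2\pi} e^{iz\cos\theta}\,d\theta. \tag{5.128} \]

Eqn. (5.121) may now be expressed in the form

\[
g(\mathbf{x},\mathbf{x}') = \frac{1}{2\pi}\int_0^\infty
\frac{r\,dr\,J_0(r|\mathbf{x}-\mathbf{x}'|)}{r^2+m^2},
\tag{5.129}
\]

and hence

\[ g(\mathbf{x},\mathbf{x}') = \frac{1}{2\pi}K_0(m|\mathbf{x}-\mathbf{x}'|). \tag{5.130} \]

The massless limit is singular, but with care can be inferred from the small
argument expansion

\[
\lim_{m\to 0} K_0(m|\mathbf{x}-\mathbf{x}'|) =
-\ln\left(\frac{m|\mathbf{x}-\mathbf{x}'|}{2}\right) - \gamma,
\tag{5.131}
\]

where $\gamma$ is the Euler–Mascheroni constant.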
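To make the singular massless limit concrete, here is a small sketch that evaluates $K_0$ from the standard integral representation $K_0(z) = \int_0^\infty e^{-z\cosh t}\,dt$ (valid for $z > 0$) and compares it with the small-argument form $-\ln(z/2) - \gamma$. The helper names, the cut-off `t_max` and the sample arguments are assumptions made for illustration; only the Python standard library is used.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def k0(z, t_max=20.0, n=20000):
    """K_0(z) for z > 0 via the midpoint rule applied to int_0^t_max exp(-z cosh t) dt."""
    h = t_max / n
    return h * sum(math.exp(-z * math.cosh((j + 0.5) * h)) for j in range(n))


def k0_small_argument(z):
    """Leading small-argument behaviour of K_0: -ln(z/2) - gamma."""
    return -math.log(z / 2.0) - EULER_GAMMA


for z in (0.1, 0.01):
    print(z, k0(z), k0_small_argument(z))
```

Writing $z = m|\mathbf{x}-\mathbf{x}'|$, the logarithm splits into $-\ln|\mathbf{x}-\mathbf{x}'|$, which reproduces the massless form up to the factor $1/2\pi$, plus a position-independent constant $-\ln(m/2) - \gamma$ that grows without bound as $m \to 0$. That constant is the sense in which the limit is singular.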


5.5 Schrödinger Green function 


Being linear in the time derivative, the solutions of the Schrödinger equation
have positive definite energy. The Fourier transform may therefore be written
as

\[
\psi(x) = \int_0^\infty\frac{d\tilde\omega}{2\pi}\int_{-\infty}^{+\infty}(d\mathbf{k})\,
e^{i(\mathbf{k}\cdot\mathbf{x}-\tilde\omega t)}\,\psi(\mathbf{k},\tilde\omega)\,
\theta(\tilde\omega)\,\delta\left(\frac{\hbar^2\mathbf{k}^2}{2m}-\hbar\tilde\omega\right).
\tag{5.132}
\]

This singles out the Schrödinger field amongst the other relativistic fields which
have solutions of both signs. Correspondingly, the Schrödinger field has only a
positive energy Wightman function; the negative energy function vanishes from
the particle theory.⁹ The positive frequency Wightman function is

\[
G^{(+)}_{\rm NR}(x,x') = -2\pi i\int_0^\infty\frac{d\tilde\omega}{2\pi}\int_{-\infty}^{+\infty}(d\mathbf{k})\,
e^{i(\mathbf{k}\cdot\Delta\mathbf{x}-\tilde\omega\Delta t)}\,
\theta(\tilde\omega)\,\delta\left(\frac{\hbar^2\mathbf{k}^2}{2m}-\hbar\tilde\omega\right).
\tag{5.133}
\]

The negative frequency Wightman function vanishes now,

\[ G^{(-)}_{\rm NR}(x,x') = 0, \tag{5.134} \]

since there is no pole in the negative $\tilde\omega$ plane to enclose. Moreover, this means
that there is no Feynman Green function in the non-relativistic theory, only a
retarded one. In the non-relativistic limit, both the Feynman Green function and
the retarded Green function for relativistic particles reduce to the same result,
which has poles only in the lower half complex $\tilde\omega$ plane. This non-relativistic
Green function satisfies the equation

\[
\left(-\frac{\hbar^2\nabla^2}{2m} - i\hbar\partial_t\right)G_{\rm NR}(x,x')
= \delta(\mathbf{x},\mathbf{x}')\,\delta(t,t').
\tag{5.135}
\]

⁹ This does not remain true at finite temperature or in interacting field theory, but there remains
a fundamental asymmetry between positive and negative energy Green functions in the
non-relativistic theory.

This Green function can be evaluated from the expression corresponding to
those in eqns. (5.74):

\[ G_{\rm NR}(x,x') = -\theta(t-t')\,G^{(+)}_{\rm NR}(x,x'). \tag{5.136} \]


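Using eqn. (5.75) in eqn. (5.133), we have

\[
G_{\rm NR}(x,x') = -\int_{-\infty}^{+\infty}d\alpha\int_0^\infty\frac{d\tilde\omega}{2\pi}
\int_{-\infty}^{+\infty}(d\mathbf{k})\,
\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x}-(\tilde\omega+\alpha)\Delta t)}}{\alpha+i\epsilon}\,
\delta\left(\frac{\hbar^2\mathbf{k}^2}{2m}-\hbar\tilde\omega\right).
\tag{5.137}
\]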

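The integral over $\alpha$ can be shifted, $\alpha \to \alpha - \tilde\omega$, without consequences for the
limits or the measure, giving

\[
G_{\rm NR}(x,x') = -\int_{-\infty}^{+\infty}d\alpha\int_0^\infty\frac{d\tilde\omega}{2\pi}
\int_{-\infty}^{+\infty}(d\mathbf{k})\,
\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x}-\alpha\Delta t)}}{(\alpha-\tilde\omega)+i\epsilon}\,
\delta\left(\frac{\hbar^2\mathbf{k}^2}{2m}-\hbar\tilde\omega\right).
\tag{5.138}
\]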


We may now integrate over $\tilde\omega$ to invoke the delta function. Noting that the
argument of the delta function is defined only for positive $\tilde\omega$, and that the integral
is also over this range, we have simply

\[
G_{\rm NR}(x,x') = -\int_{-\infty}^{+\infty}d\alpha\int_{-\infty}^{+\infty}(d\mathbf{k})\,
\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x}-\alpha\Delta t)}}
{\left(\hbar\alpha-\frac{\hbar^2\mathbf{k}^2}{2m}\right)+i\epsilon},
\tag{5.139}
\]

or, re-labelling $\alpha \to \tilde\omega$,

\[
G_{\rm NR}(x,x') = \int_{-\infty}^{+\infty}d\tilde\omega\int_{-\infty}^{+\infty}(d\mathbf{k})\,
\frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x}-\tilde\omega\Delta t)}}
{\left(\frac{\hbar^2\mathbf{k}^2}{2m}-\hbar\tilde\omega\right)-i\epsilon}.
\tag{5.140}
\]

In spite of appearances, the parameter $\tilde\omega$ is not really the energy of the system,
since it runs from minus infinity to plus infinity. It should properly be regarded
only as a variable of integration. It is clear from this expression that the
Schrödinger field has a single pole in the lower half complex plane. It therefore
satisfies purely retarded boundary conditions. We shall see in section 13.2.2
how the relativistic Feynman Green function reduces to a purely retarded one in
the non-relativistic limit.
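
The link between a single lower-half-plane pole and retarded behaviour can be made explicit by doing the frequency integral for one fixed wavevector. The following worked step is a sketch under two assumptions that are not fixed by the text: the frequency measure is normalized as $d\tilde\omega/2\pi$, and $\omega_k \equiv \hbar\mathbf{k}^2/2m$ is introduced purely as shorthand.

```latex
\begin{align*}
\int_{-\infty}^{+\infty}\frac{d\tilde\omega}{2\pi}\,
\frac{e^{-i\tilde\omega\Delta t}}{\hbar\omega_k-\hbar\tilde\omega-i\epsilon}
&= -\frac{1}{\hbar}\int_{-\infty}^{+\infty}\frac{d\tilde\omega}{2\pi}\,
\frac{e^{-i\tilde\omega\Delta t}}{\tilde\omega-(\omega_k-i\epsilon/\hbar)} \\
&= \begin{cases}
\dfrac{i}{\hbar}\,e^{-i\omega_k\Delta t}, & \Delta t>0
\quad\text{(contour closed below, clockwise, pole enclosed)},\\[2ex]
0, & \Delta t<0
\quad\text{(contour closed above, no pole enclosed)}.
\end{cases}
\end{align*}
```

Acting on $\theta(\Delta t)\,(i/\hbar)\,e^{i\mathbf{k}\cdot\Delta\mathbf{x}-i\omega_k\Delta t}$ with the operator of eqn. (5.135), the terms proportional to $\theta(\Delta t)$ cancel and the time derivative of the step function leaves $\delta(\Delta t)\,e^{i\mathbf{k}\cdot\Delta\mathbf{x}}$, so each mode contributes only for $t > t'$ and the sum over modes rebuilds the spatial delta function.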


5.6 Dirac Green functions 


The Dirac Green function satisfies an equation which is first order in the 
derivatives, but which is matrix-valued. The equation of motion for the Dirac 
field, 


\[ (-i\gamma^\mu\partial_\mu + m)\,\psi = J, \tag{5.141} \]

tells us that a formal solution may be written as

\[ \psi = \int dV_{x'}\; S(x,x')\,J(x'), \tag{5.142} \]

where the spinor Green function is defined by

\[ (-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,S(x,x') = \delta(x,x'). \tag{5.143} \]


Although this looks rather different to the scalar field case, S(x, x’) can be 
obtained from the expression for the scalar propagator by noting that 


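\[
(-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)(i\hbar c\,\gamma^\mu\partial_\mu + mc^2)
= -\hbar^2c^2\,\Box + m^2c^4 + \frac{1}{2}[\gamma^\mu,\gamma^\nu]\,\partial_\mu\partial_\nu,
\tag{5.144}
\]

and the latter term vanishes when operating on non-singular objects. It follows
for the free field that

\[ (i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,G^{(\pm)}(x,x') = S^{(\pm)}(x,x') \tag{5.145} \]
\[ (i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,G_F(x,x') = S_F(x,x') \tag{5.146} \]
\[ (-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,S^{(\pm)}(x,x') = 0 \tag{5.147} \]
\[ (-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,S_F(x,x') = \delta(x,x'). \tag{5.148} \]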
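
The factorization in eqn. (5.144) can be checked in momentum space, where $\partial_\mu \to ik_\mu$ and the two first-order operators become $(\gamma^\mu k_\mu + m)$ and $(-\gamma^\mu k_\mu + m)$. The sketch below multiplies them as explicit 4×4 matrices and compares with $(k^2 + m^2)$ times the identity. It rests on assumptions that the text above does not spell out: units with $\hbar = c = 1$, the Dirac representation of the gamma matrices with $(\gamma^0)^2 = +1$ and $(\gamma^i)^2 = -1$, and a metric of signature $(-+++)$, so that $k^2 = -k_0^2 + \mathbf{k}^2$. All helper names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ID4 = block(ID2, ZERO2, ZERO2, ID2)
# Dirac representation (an assumed choice): gamma^0 is diagonal, gamma^i is off-diagonal.
GAMMA = [block(ID2, ZERO2, ZERO2, scale(-1, ID2))]
GAMMA += [block(ZERO2, s, scale(-1, s), ZERO2) for s in SIGMA]


def operator_product(k, m):
    """(gamma.k + m)(-gamma.k + m) for lower-index components k = (k_0, k_1, k_2, k_3)."""
    slash = scale(0, ID4)
    for gamma_mu, k_mu in zip(GAMMA, k):
        slash = add(slash, scale(k_mu, gamma_mu))
    left = add(slash, scale(m, ID4))
    right = add(scale(-1, slash), scale(m, ID4))
    return matmul(left, right)


k = (0.3, 1.1, -0.7, 0.4)  # arbitrary test momentum
m = 0.9                    # arbitrary test mass
product = operator_product(k, m)
k_squared = -k[0] ** 2 + k[1] ** 2 + k[2] ** 2 + k[3] ** 2
expected = scale(k_squared + m ** 2, ID4)
deviation = max(abs(product[i][j] - expected[i][j]) for i in range(4) for j in range(4))
print(deviation)
```

For commuting momentum components only the symmetric part of $\gamma^\mu\gamma^\nu$ survives, which is the matrix counterpart of the statement that the commutator term vanishes on non-singular objects.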


5.7 Photon Green functions 


The Green function for the Maxwell field satisfies the (n+ 1) dimensional vector 
equation 


\[ \left[-\Box\,\delta_\mu{}^\nu + \partial_\mu\partial^\nu\right]A^\mu(x) = \mu_0 J^\nu. \tag{5.149} \]

As usual, we look for the inverse of the operator,¹⁰ which satisfies

\[
\left[-\Box\,\delta_\mu{}^\nu + \partial_\mu\partial^\nu\right]D_\nu{}^\rho(x,x')
= \mu_0 c\,\delta_\mu{}^\rho\,\delta(x,x').
\tag{5.150}
\]

Formally, it can be written as a Fourier transform:
! ik(x—x’) | Suv ktk” 
D(x, x’) = noe f awe" ) E mia | (5.151) 


In this case, however, there is a problem. In inverting the operator, we are 
looking for a constraint which imposes the equations of motion. For scalar 
particles, this is done by going to momentum space and constructing the Green 
function, which embodies the equations of motion in the dispersion relation 
$k^2 + m^2 = 0$ (see eqn. (5.40)). In this case, that approach fails.

The difficulty here is the gauge symmetry. Suppose we consider the determi- 
nant of the operator in eqn. (5.149). A straightforward computation shows that 
this determinant vanishes: 


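\[
\begin{vmatrix}
-\Box + \partial_0\partial^0 & \partial_0\partial^i \\
\partial_i\partial^0 & -\Box + \partial_i\partial^i
\end{vmatrix} = 0.
\tag{5.152}
\]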


In linear algebra, this would be a signal that the matrix was not invertible, the 
matrix equivalent of dividing by zero. It also presents a problem here. The 
problem is not that the operator is not invertible (none of the Green function 
equations are invertible when the constraints they impose are fulfilled, since 
they correspond precisely to a division by zero), but rather that it implies no 
constraint at all. In the case of a scalar field, we have the operator constraint, or 
its momentum-space form: 


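\[
\begin{aligned}
-\hbar^2c^2\,\Box + m^2c^4 &= 0 \\
p^2c^2 + m^2c^4 &= 0.
\end{aligned}
\tag{5.153}
\]

In the vector case, one has

\[ \det\left[-\Box\,\delta_\mu{}^\nu + \partial_\mu\partial^\nu\right] = 0, \tag{5.154} \]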


but this is an identity which is solved for every value of the momentum. Thus,
the Green function in eqn. (5.151) supplies an infinite number of solutions for
$A_\mu$ for every $J$, one for each unrestricted value of $k$, which makes eqn. (5.151)
singular.

The problem can be traced to the gauge symmetry of the field $A_\mu(x)$. Under
a gauge transformation, $A_\mu \to A_\mu + \partial_\mu s$, but

\[ \left[-\Box\,\delta_\mu{}^\nu + \partial_\mu\partial^\nu\right](\partial^\mu s) = 0 \tag{5.155} \]

for any function $s(x)$. It can be circumvented by breaking the gauge symmetry
in such a way that the integral over $k$ in eqn. (5.151) is restricted. A convenient
choice is the so-called Lorentz gauge condition

\[ \partial_\mu A^\mu = 0. \tag{5.156} \]

¹⁰ Note that the operator has one index up and one index down, thereby mapping contravariant
eigenvectors to contravariant eigenvectors.

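This can be enforced by adding a Lagrange multiplier to the Maxwell action,

\[
S \to \int(dx)\left\{\frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} - J^\mu A_\mu
+ \frac{1}{2\alpha\mu_0}\left(\partial^\mu A_\mu\right)^2\right\},
\tag{5.157}
\]

so that eqn. (5.149) is modified to

\[
\left[-\Box\,\delta_\mu{}^\nu + \left(1-\frac{1}{\alpha}\right)\partial_\mu\partial^\nu\right]A^\mu(x)
= \mu_0 J^\nu.
\tag{5.158}
\]

It may now be verified that the determinant of the operator no longer vanishes
for all $\alpha$; thus, a formal constraint is implied over the $k_\mu$, and the Green function
may be written

\[
D_{\mu\nu}(x,x') = c\mu_0\int(dk)\,e^{ik(x-x')}
\left[\frac{g_{\mu\nu}}{k^2} + (\alpha-1)\frac{k_\mu k_\nu}{k^4}\right].
\tag{5.159}
\]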


This constraint is not a complete breakage of the gauge symmetry, since one
may gauge transform eqn. (5.156) and show that

\[ \partial_\mu A^\mu \to \partial_\mu A^\mu + \Box\,s(x) = 0. \tag{5.160} \]

Thus, the gauge condition still admits restricted gauge transformations such that

\[ \Box\,s(x) = 0. \tag{5.161} \]


However, this modification is sufficient to obtain a formal Green function, and 
so the additional gauge multi-valuedness is often not addressed. 
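
The vanishing determinant and its repair by the gauge-fixing term can be illustrated in momentum space, where the operators of eqns. (5.149) and (5.158) become the 4×4 matrix $k^2\delta_\mu{}^\nu - (1 - 1/\alpha)k_\mu k^\nu$. The sketch below uses exact rational arithmetic. The metric signature $(-+++)$, the sample momentum and the helper names are assumptions made for the example.

```python
from fractions import Fraction

METRIC = [-1, 1, 1, 1]  # diagonal of g^{mu nu}; signature (-+++) is assumed


def operator_matrix(k_lower, inv_alpha):
    """M[mu][nu] = k^2 delta_mu^nu - (1 - 1/alpha) k_mu k^nu in momentum space."""
    k_upper = [g * k for g, k in zip(METRIC, k_lower)]
    k2 = sum(a * b for a, b in zip(k_lower, k_upper))
    c = 1 - inv_alpha
    return [[(k2 if mu == nu else 0) - c * k_lower[mu] * k_upper[nu]
             for nu in range(4)] for mu in range(4)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for j in range(col, n):
                a[r][j] -= factor * a[col][j]
    return det


k = [1, 2, 3, 4]       # arbitrary lower-index components (k_0, k_1, k_2, k_3)
k2 = -1 + 4 + 9 + 16   # k^2 = 28 with the assumed metric

det_ungauged = determinant(operator_matrix(k, Fraction(0)))    # no gauge-fixing term (1/alpha = 0)
det_gauged = determinant(operator_matrix(k, Fraction(1, 2)))   # alpha = 2
print(det_ungauged, det_gauged, Fraction(k2 ** 4, 2))
```

For this rank-one modification of $k^2$ times the identity, the determinant works out to $(k^2)^4/\alpha$: it vanishes identically, for every $k$, only when the gauge-fixing term is absent.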


5.8 Principal values and Kramers–Kronig relations


Green functions which satisfy retarded (or advanced) boundary conditions
satisfy a special pair of Fourier frequency-space relations, called the
Kramers–Kronig relations (these are also referred to as Bode’s law in circuit theory),
by virtue of the fact that all of their poles lie in one half-plane (see figure
5.5). These relations are an indication of purely causal or purely acausal
behaviour. In particular, physical response functions satisfy such relations,
including the refractive index (or susceptibility, in non-magnetic materials) and
the conductivity.

Fig. 5.5. Contour in the complex plane for the Kramers–Kronig relations.


Cauchy’s integral formula states that the value of a function $G(\omega)$, which is
analytic at every point within and on a closed curve C, and is evaluated at a point
$\omega = z$, is given by the integral around the closed loop C of

\[ \oint_C \frac{G(\omega)\,d\omega}{\omega - z} = 2\pi i\,G(z). \tag{5.162} \]


If a point P lies outside the closed loop, the value of the integral at that point is
zero. Consider then a field $G(t-t')$ which satisfies retarded boundary conditions

\[ G(t-t') = \int\frac{d\omega}{2\pi}\,e^{-i\omega(t-t')}\,G(\omega). \tag{5.163} \]

The Fourier transform $G(\omega)$, where

\[ G(\omega) = \int d(t-t')\,e^{i\omega(t-t')}\,G(t-t'), \tag{5.164} \]

is analytic in the upper half-plane, as in figure 5.5, but has a pole on the real
axis. In the analytic upper region, the integral around a closed curve is zero, by
Cauchy’s theorem:

\[ \oint_C \frac{G(\omega)\,d\omega}{\omega - z} = 0, \tag{5.165} \]

where we assume that the integrand has a simple pole at $\omega = z$ on the real axis. We can
write the parts of this integral in terms of the principal value of the integral along the
real axis, plus the integral around the small semi-circle enclosing the pole. The
integral over the semi-circle at infinity vanishes over the causal region, since
$\exp(i\omega(t - t'))$ converges if $t - t' > 0$ and $\omega$ has a positive imaginary part.
Around the semi-circle we have, letting $\omega - z = \epsilon\,e^{i\theta}$,

\[
\int_{\rm SC}\frac{G(\omega)\,d\omega}{\omega - z}
= -\lim_{\epsilon\to 0}\int_0^\pi
\frac{G(\epsilon e^{i\theta}+z)\;i\epsilon e^{i\theta}\,d\theta}{\epsilon e^{i\theta}}
= -i\pi\,G(\epsilon e^{i\theta}+z)\Big|_{\epsilon\to 0}
= -i\pi\,G(z).
\tag{5.166}
\]
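Then we have

\[
\oint_C\frac{G(\omega)\,d\omega}{\omega - z}
= P\int_{-\infty}^{\infty}\frac{G(\omega)\,d\omega}{\omega - z} - i\pi\,G(z) = 0.
\tag{5.167}
\]

The first term after the first equality is the so-called principal value of the integral
along the real axis. For a single pole, the principal value is defined strictly by
the limit

\[
P\int_{-\infty}^{+\infty} = \lim_{\epsilon\to 0}
\left[\int_{-\infty}^{p_i-\epsilon} + \int_{p_i+\epsilon}^{\infty}\right],
\tag{5.168}
\]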


which approaches the singularity from equal distances on both sides. The
expression may be generalized to two or more poles by arranging the limits
of the integral to approach all poles symmetrically. Thus, if we now write the
real and imaginary parts of $G(\omega)$ explicitly as

\[ G(z) = G_R(z) + iG_I(z), \tag{5.169} \]

and substitute this into eqn. (5.167), then, comparing real and imaginary parts
we have:

\[
\begin{aligned}
P\int_{-\infty}^{\infty}\frac{G_R(\omega)\,d\omega}{\omega - z} &= -\pi\,G_I(z) \\
P\int_{-\infty}^{\infty}\frac{G_I(\omega)\,d\omega}{\omega - z} &= \pi\,G_R(z).
\end{aligned}
\tag{5.170}
\]

These are the so-called Kramers–Kronig relations. They indicate that the
analyticity of $G(\omega)$, which follows from the causal behaviour of $G(t - t')$, implies
a relationship between the real and imaginary parts of $G(\omega)$.

The generalization of these expressions to several poles along the real axis 
may be written 


P J. CURED: X Er Grik). (5.171) 


0° ane poles 
The integral along the real axis piece of the contour may be used to derive an
expression for the principal value of $1/\omega$. From eqn. (5.167), we may write

\[ \frac{1}{\omega - z} = P\frac{1}{\omega - z} - i\pi\,\delta(\omega - z). \tag{5.172} \]


This relation assumes that we have integrated along the real axis in a positive 
direction, avoiding a single pole on the real axis by passing above it (or, 
equivalently, by pushing the pole into the lower half-plane by an infinitesimal 
amount ie). Apart from these assumptions, it is quite general. It does not make 
any other assumptions about the nature of G(w), nor does it depend on the 
presence of any other poles which do not lie on the real axis. It is a property 
of the special contour segment which passes around one pole. Had the contour 
passed under the pole instead of over it, the sign of the second term would have 
been changed. These results can be summarized and generalized to several poles 
on the real axis, by writing 


\[
\frac{1}{(\omega - z_i) \pm i\epsilon} = P\frac{1}{\omega - z_i} \mp i\pi\,\delta(\omega - z_i),
\tag{5.173}
\]

where $z$ is a general point in the complex plane, $z_i$ are the poles on the real axis
and $\epsilon \to 0$ is assumed. The upper sign is that for passing over the poles, while
the lower sign is for passing under.
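
The relations in eqn. (5.170) can be tried out on a simple test function. The sketch below takes $G(\omega) = 1/(\omega + i)$, whose only pole lies in the lower half-plane, and evaluates the principal-value integrals numerically. The test function, the symmetric-difference form of the principal value and the variable change $u = s/(1-s)$ are choices made for this example, not taken from the text.

```python
import math


def g_real(w):
    """Real part of the test response function 1/(w + i)."""
    return w / (w * w + 1.0)


def g_imag(w):
    """Imaginary part of the test response function 1/(w + i)."""
    return -1.0 / (w * w + 1.0)


def principal_value(f, z, n=20000):
    """P int f(w)/(w - z) dw over the real line.

    The pole is approached symmetrically by pairing w = z + u with w = z - u,
    giving int_0^inf [f(z + u) - f(z - u)] / u du, which is then mapped to the
    unit interval with u = s / (1 - s) and summed by the midpoint rule.
    """
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        u = s / (1.0 - s)
        jacobian = 1.0 / (1.0 - s) ** 2
        total += (f(z + u) - f(z - u)) / u * jacobian
    return total * h


z = 0.5
print(principal_value(g_real, z), -math.pi * g_imag(z))
print(principal_value(g_imag, z), math.pi * g_real(z))
```

Each printed pair should agree closely, which is the content of the two lines of eqn. (5.170) for this particular causal response function.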


5.9 Representation of bound states in field theory 


Bound states are states in which ‘particles’ are completely confined by a
potential $V(x)$. Confinement is a simple interaction between two different fields:
a dynamical field $\psi(x)$ and a static confining field $V(x)$. The way in which one
represents bound states in field theory depends on which properties are germane
to the description of the physical system. There are two possibilities.

The first alternative is the approach traditionally used in quantum mechanics.
Here one considers the potential $V(x)$ to be a fixed potential, which breaks
translational symmetry, e.g.

\[
\left(-\frac{\hbar^2}{2m}\nabla^2 + V(x)\right)\psi(x) = i\hbar\,\partial_t\psi(x).
\tag{5.174}
\]

One then considers the equation of motion of $\psi(x)$ in the rest frame of this
potential and solves it using whatever methods are available. A Green function
formulation of this problem leads to the Lippmann–Schwinger equation for
example (see section 17.5). In this case, the dynamical variable is the field,
which moves in an external potential and is confined by it, e.g. electrons moving
in the spherical hydrogen atom potential.

A second possibility is to consider bound states as multi-level, internal
properties of the dynamical variables in question. For instance, instead of
formulating the motion of electrons in a hydrogen atom, one formulates the
motion of hydrogen atoms with internal electron levels which can be excited.
To do this, one introduces multiplet states (an index A on the field and on the
constant potential), e.g.

\[
\left(-\frac{\hbar^2}{2m}\nabla^2 + V_A\right)\psi_A(x) = i\hbar\,\partial_t\psi_A(x).
\tag{5.175}
\]

This is an effective theory in which one takes the average value of the potential
$V_A$ at N different levels, where A = 1, ..., N. The values of $V_A$ signify the
energy differences between levels in the atom. The field $\psi_A$ now represents
the whole atom, not the electron within it. Clearly, all the components of $\psi_A$
move together, according to the same equation of motion. The internal indices
have the character of a broken internal ‘symmetry’. This approach allows one to
study the dynamics and kinematics of hydrogen atoms in motion (rather than the
behaviour of electrons in the rest frame of the atom). Such a study is of interest
when considering how transitions are affected by sources outside the atom. An
example of this is provided by the classic interaction between two levels of a
neutral atom and an external radiation field (see section 10.6.3). This approach
is applicable to laser cooling, for instance, where radiation momentum has a
braking effect on the kinetic activity of the atoms.


6 


Statistical interpretation of the field 


6.1 Fluctuations and virtual processes 


Although it arises naturally in quantum field theory from unitarity, the Feynman 
Green function does not arise naturally in classical field theory. It contains 
explicitly acausal terms which defy our experience of mechanics. It has special 
symmetry properties: it depends only on |x — x’|, and thus distinguishes no 
special direction in space or time. It seems to characterize the uniformity of 
spacetime, or of a physical system in an unperturbed state. 

The significance of the Feynman Green function lies in the effective under- 
standing of complex systems, where Brownian fluctuations in bulk have the 
macroscopic effect of mixing or stirring. In field theory, its use as an intuitive 
model for fluctuations allows the analysis of population distributions and the 
simulation of field decay, by spreading an energy source evenly about the 
possible modes of the system. 


6.1.1 Fluctuation generators: $G_F(x, x')$ and $G_E(x, x')$


The Feynman Green function is related to the Green function for Euclidean
space. Beginning with the expression in eqn. (5.99), one performs an
anti-clockwise rotation of the integration contour (see figure 5.3):

\[ k^0_E = i k_0. \tag{6.1} \]

There are no obstacles (poles) which prevent this rotation, so the two expressions
are completely equivalent. With this contour definition the integrand is positive
definite and no poles are encountered in an integral over $k_{0E}$:

\[
\frac{1}{-k_0^2 + \mathbf{k}^2 + m^2 - i\epsilon} \to \frac{1}{k_{0E}^2 + \mathbf{k}^2 + m^2}.
\tag{6.2}
\]


There are several implications to this equivalence between the Feynman Green
function and the Euclidean Green function. The first is that Wick rotation to
Euclidean space is a useful technique for evaluating Green function integrals,
without the interference of poles and singularities. Another is that the Euclidean
propagator implies the same special causal relationship between the source and 
the field as does the Feynman Green function. In quantum field theory, one 
would say that these Green functions formed time-ordered products. 

In the classical theory, the important point is the spacetime symmetry of the 
Green functions. Owing to the quadratic nature of the integral above, it is 
clear that both the Feynman and Euclidean Green functions depend only on the 
absolute value of |x—x’|. They single out no special direction in time. Physically 
they represent processes which do not develop in time, or whose average effect 
over an infinitesimal time interval is zero. 

These Green functions are a differential representation of a cycle of emission
and absorption (see below). They enable one to represent fluctuations or virtual
processes in the field which do not change the overall state. These are processes
in which an excitation is emitted from a source and is absorbed by a sink over
a measurable interval of time.¹ This is a doorway to the study of statistical
equilibria.

Statistical (many-particle) effects are usually considered the domain of
quantum field theory. Their full description, particularly away from equilibrium,
certainly requires the theory of interacting fields, but the essence of statistical
mechanics is contained within classical concepts of ensembles. The fact that
a differential formulation is possible through the Green function has profound
consequences for field theory. Fluctuations are introduced implicitly through the
boundary conditions on the Green functions. The quantum theory creates a more
elaborate framework to justify this choice of boundary conditions, and takes it
further. However, when it comes down to it, the idea of random fluctuations
in physical systems is postulated from experience. It does not follow from any
deeper physical principle, nor can it be derived. Its relationship to Fock space
methods of counting states is fascinating though. This differential formulation
of statistical processes is explored in this chapter.²


6.1.2 Correlation functions and generating functionals 


The Feynman (time-ordered) Green function may be obtained from a generating
functional W which involves the action. From this generating functional it
is possible to see that a time-translation-invariant field theory, expressed in
terms of the Feynman Green function, is analytically related to a statistically
weighted ensemble of static systems. The action $S[\phi(x)]$ is already a generating
functional for the mechanics of the system, as noted in chapter 4. The additional
generating functional $W[J]$ may be introduced in order to study the statistical
correlations in the field. This is a new concept, and it requires a new generating
functional, the effective action. The effective action plays a central role in
quantum field theory, where the extension to interacting fields makes internal
dynamics, and thence the statistical interpretation, even more pressing.

¹ Actually, almost all processes can be studied in this way by assuming that the field tends to a
constant value (usually zero) at infinity.

² In his work on source theory, Schwinger [118, 119] constructs quantum transitions and
statistical expectation values from the Feynman Green function $\Delta_+$, using the principle of
spacetime uniformity (the Euclidean hypothesis). The classical discussion here is essentially
equivalent to his treatment of weak sources.

We begin by defining averages and correlated products of the fields. This
is the route to a statistical interpretation. Consider a field theory with fields
$\phi^A$, $\phi^\dagger_B$ and action $S^{(2)}$. The superscript here denotes the fact that the action
is one for free fields and is therefore of purely quadratic order in the fields. In
the following sections, we use the complex field $\phi(x)$ to represent an arbitrary
field. The same argument applies, with only irrelevant modifications, for general
fields. We may write

\[ S^{(2)} = \int(dx)\;\phi^{\dagger A}\,\hat O_{AB}\,\phi^B, \tag{6.3} \]

where the Gaussian weighted average, for statistical weight $\rho = \exp(iS/s)$, is
then defined by

\[
\langle F[\phi]\rangle = \frac{{\rm Tr}(\rho F)}{{\rm Tr}\,\rho}
= \frac{\int d\mu[\phi]\,F[\phi]\,e^{iS^{(2)}/s}}{\int d\mu[\phi]\,e^{iS^{(2)}/s}},
\tag{6.4}
\]

where s is an arbitrary scale with the dimensions of action. In quantum field
theory, it is normal to use $s = \hbar$, but here we keep it general to emphasize that
the value of this constant cancels out of relevant formulae at this classical level.
Do not be tempted to think that we are now dealing with quantum field theory,
simply because this is a language which grew up around the second quantization.
The language is only a convenient mathematical construction, which is not tied
to a physical model. In this section, we shall show that the Gaussian average
over pairs of fields results in the classical Feynman Green function. Consider
the generating functional

\[
Z[J,J^\dagger] = \int d\mu[\phi,\phi^\dagger]\;
e^{\frac{i}{s}\int(dx)\left[\phi^{\dagger A}\hat O_{AB}\phi^B - \phi^{\dagger A}J_A - J^\dagger_B\phi^B\right]},
\tag{6.5}
\]

which bears notable similarities to the classical thermodynamical partition
function. From the definitions above, we may write

\[
\frac{Z[J,J^\dagger]}{Z[0,0]} = \left\langle\exp\left(
-\frac{i}{s}\int(dx)\,\phi^{\dagger A}J_A - \frac{i}{s}\int(dx)\,J^\dagger_A\phi^A
\right)\right\rangle,
\tag{6.6}
\]

where the currents $J^A$ and $J^\dagger_B$ are of the same type as $\phi^A$ and $\phi^\dagger_B$, respectively.
The effective action, as a function of the sources $W[J, J^\dagger]$, is defined by

\[ \exp\left(\frac{i}{s}W[J,J^\dagger]\right) = Z[J,J^\dagger], \tag{6.7} \]

thus $W[J, J^\dagger]$ is like the average value of the action, where the average is
defined by the Gaussian integral. Now consider a shift of the fields in the action,
which diagonalizes the exponent in eqn. (6.6):

\[
(\phi+K)^{\dagger A}\,\hat O_{AB}\,(\phi+L)^B - K^{\dagger A}\hat O_{AB}L^B
= \phi^{\dagger A}\hat O_{AB}\phi^B + \phi^{\dagger A}\hat O_{AB}L^B + K^{\dagger A}\hat O_{AB}\phi^B.
\tag{6.8}
\]

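The right hand side of this expression is the original exponent in eqn. (6.5),
provided we identify

\[ \hat O_{AB}L^B(x) = J_A(x) \tag{6.9} \]
\[ \Rightarrow\quad L^A(x) = \int(dx')\,(\hat O^{-1})^{AB}(x,x')\,J_B(x') \tag{6.10} \]

and

\[ K^{\dagger A}(x)\,\hat O_{AB} = J^\dagger_B(x) \tag{6.11} \]
\[ \Rightarrow\quad K^{\dagger A}(x) = \int(dx')\,J^\dagger_B(x')\,(\hat O^{-1})^{AB}(x,x'), \tag{6.12} \]

where $\int(dx')\,(\hat O^{-1})^{AB}\hat O_{BC} = \delta^A{}_C$. With these definitions, it follows that

\[ K^{\dagger A}\hat O_{AB}L^B = \int(dx)(dx')\;J^\dagger_A\,(\hat O^{-1})^{AB}\,J_B \tag{6.13} \]

and so

\[
Z[J,J^\dagger] = \int d\mu[\phi,\phi^\dagger]\;
e^{\frac{i}{s}\int(dx)\left[(\phi+K)^{\dagger A}\hat O_{AB}(\phi+L)^B - J^\dagger_A(\hat O^{-1})^{AB}J_B\right]}.
\tag{6.14}
\]

We may now translate away $L^A$ and $K^A$, assuming that the functional measure
is invariant. This leaves

\[
Z[J,J^\dagger] = \exp\left(-\frac{i}{s}\int(dx)(dx')\;J^\dagger_A\,(\hat O^{-1})^{AB}\,J_B\right)Z[0,0]
\tag{6.15}
\]

or

\[ W[J,J^\dagger] = -\int(dx)(dx')\;J^\dagger_A\,(\hat O^{-1})^{AB}\,J_B + {\rm const}. \tag{6.16} \]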
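By differentiating $W[J, J^\dagger]$ with respect to the source, we obtain

\[ \langle\phi^A\rangle = \frac{\delta W}{\delta J^\dagger_A(x)} = is\int(dx')\,(\hat O^{-1})^{AB}J_B \tag{6.17} \]
\[ \langle\phi^{\dagger A}\rangle = \frac{\delta W}{\delta J_A(x)} = is\int(dx')\,(\hat O^{-1})^{AB}J^\dagger_B \tag{6.18} \]
\[ \langle\phi^{\dagger A}\phi^{\dagger B}\rangle = is\,\frac{\delta^2 W}{\delta J_A\,\delta J_B} = 0 \tag{6.19} \]
\[ \langle\phi^A\phi^B\rangle = is\,\frac{\delta^2 W}{\delta J^\dagger_A\,\delta J^\dagger_B} = 0 \tag{6.20} \]
\[ \langle\phi^A\phi^{\dagger B}\rangle = is\,\frac{\delta^2 W}{\delta J^\dagger_A\,\delta J_B} = is\,(\hat O^{-1})^{AB}. \tag{6.21} \]

One may now identify $(\hat O^{-1})^{AB}$ as the inverse of the operator in the quadratic
part of the action, which is clearly a Green function, i.e.

\[ \langle\phi^A\phi^{\dagger B}\rangle = is\,G^{AB}(x,x'). \tag{6.22} \]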

Moreover, we have evaluated the generator for correlations in the field $W[J]$.
Returning to real scalar fields, we have

\[ W[J] = -\frac{1}{2}\int(dx)(dx')\;J_A(x)\,G^{AB}(x,x')\,J_B(x'). \tag{6.23} \]


We shall use this below to elucidate the significance of the Green function for 
the fluctuations postulated in the system. Notice that although the generator, in 
this classical case, is independent of the scale s, the definition of the correlation 
function in eqn. (6.21) does depend on this scale. This tells us simply the 
magnitude of the fluctuations compared with the scale of W[J] (the typical 
energy scale or rate of work in the system over a time interval). If one takes
$s = \hbar$, we place the fluctuations at the quantum level. If we take $s \sim \beta^{-1}$,
we place fluctuations at the scale of thermal activity $kT$.³ Quantum fluctuations
become unimportant in the classical limit $\hbar \to 0$; thermal fluctuations become
unimportant in the low-temperature limit $\beta \to \infty$. At the level of the present
discussion, the results we can derive from the correlators are independent of 
this scale, so a macroscopic perturbation would be indistinguishable from a 
microscopic perturbation. It would be a mistake to assume that this scale were 
unimportant however. Changes in this scaling factor can lead to changes in 
the correlation lengths of a system and phase transitions. This, however, is the 
domain of an interacting (quantum) theory. 
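
A finite-dimensional caricature may help to make the role of $W$ as a generator of correlations concrete. The sketch below replaces the field by a single real variable and the oscillatory weight $\exp(iS/s)$ by a real Gaussian weight $\exp(-a\phi^2/2 + J\phi)$ with $s = 1$. Both replacements are simplifying assumptions made for this example, so it illustrates the structure of eqns. (6.16) to (6.22) rather than reproducing them. In this setting $W(J) = \ln Z(J)$ is quadratic in the source, and its second derivative is the inverse of the quadratic operator, $1/a$, which is also the directly computed average of $\phi^2$.

```python
import math


def log_partition(source, a=2.0, half_width=10.0, n=4000):
    """ln Z(J) with Z(J) = int dphi exp(-a phi^2 / 2 + J phi), by the midpoint rule."""
    h = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        phi = -half_width + (j + 0.5) * h
        total += math.exp(-0.5 * a * phi * phi + source * phi)
    return math.log(total * h)


def two_point_from_generator(a=2.0, dj=1e-2):
    """Second derivative of W(J) = ln Z(J) at J = 0, by central differences."""
    return (log_partition(dj, a) - 2.0 * log_partition(0.0, a) + log_partition(-dj, a)) / dj ** 2


def two_point_direct(a=2.0, half_width=10.0, n=4000):
    """Weighted average of phi^2 at zero source, computed directly."""
    h = 2.0 * half_width / n
    numerator = 0.0
    denominator = 0.0
    for j in range(n):
        phi = -half_width + (j + 0.5) * h
        weight = math.exp(-0.5 * a * phi * phi)
        numerator += phi * phi * weight
        denominator += weight
    return numerator / denominator


a = 2.0
print(two_point_from_generator(a), two_point_direct(a), 1.0 / a)
```

Restoring a scale $s$ in the weight, $\exp[(-a\phi^2/2 + J\phi)/s]$, multiplies the average of $\phi^2$ by $s$, which mirrors the remark that the correlation function depends on the scale of the fluctuations.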


3 These remarks reach forward to quantum field theories; they cannot be understood from the 
simple mechanical considerations of the classical field. However they do appeal to one’s 
intuition and make the statistical postulate more plausible. 

We have related the generating functional $W[J]$ to weighted-average products
over the fields. These have an automatic symmetry in their spacetime arguments,
so it is clear that the object $G^{AB}(x, x')$ plays the role of a correlation function for
the field. The symmetry of the generating functional alone implies that $(\hat O^{-1})^{AB}$
must be the Feynman Green function. We shall nevertheless examine this point
more closely below.

A note of caution to end this section: the spacetime symmetry of the Green 
function follows from the fact that the integrand in 


$$G(x, x') = \int (dk)\, \frac{e^{ik(x - x')}}{k^2 + m^2} \tag{6.24}$$

is a purely quadratic quantity. A correlator must depend only on the signless 
difference between spacetime events |x — x’|, if it is to satisfy the relations 
in the remainder of this section on dissipation and transport. If the spectrum 
of excitations were to pick up, say, an absorptive term, which singled out a
special direction in time, this symmetry property would be spoiled, and, after an 
infinitesimal time duration, the Green functions would give the wrong answer 
for the correlation functions. In that case, it would be necessary to analyse 
the system more carefully using methods of non-equilibrium field theory. In 
practice, the simple formulae given in the rest of this chapter can only be applied 
to derive instantaneous tendencies of the field, never prolonged instabilities. 


6.1.3 Symmetry and causal boundary conditions 


There are two Green functions which we might have used in eqn. (6.21) as the 
inverse of the Maxwell operator; the retarded Green function and the Feynman 
Green function. Both satisfy eqn. (5.62). The symmetry of the expression 


$$W = -\frac{1}{2}\int (dx)(dx')\, J(x)\, G(x, x')\, J(x') \tag{6.25}$$


precludes the retarded function however. The integral is spacetime-symmetrical, 
thus, only the symmetrical part of the Green function contributes to the integral. 
This immediately excludes the retarded Green function, since 


$$\begin{aligned}
W_r &= -\frac{1}{2}\int (dx)(dx')\, J(x)\, G_r(x, x')\, J(x') \\
&= -\frac{1}{2}\int (dx)(dx')\, J(x)\left[G^{(+)}(x, x') + G^{(-)}(x, x')\right] J(x') \\
&= -\frac{1}{2}\int (dx)(dx')\, J(x)\left[G^{(+)}(x, x') - G^{(+)}(x, x')\right] J(x') \\
&= 0,
\end{aligned} \tag{6.26}$$

where the last line follows on re-labelling x, x’ in the second term. This relation
tells us that there is no dissipation in one-particle quantum theory. As we shall
see, however, it does not preclude dissipation by re-distribution of energy in
‘many-particle’ or statistical systems coupled to sources. See an example of this
in section 7.4.1. Again, the link to statistical systems is the Feynman Green 
function or correlation function. The Feynman Green function is symmetrical 
in its spacetime arguments. It is straightforward to show that 


$$W = -\frac{1}{2}\int (dx)(dx')\, J(x)\, G_F(x, x')\, J(x')$$


= -3 [OTE o. (6.27) 


The imaginary part of $G_F(x, x')$ is

$$\mathrm{Im}\, G_F(x, x') = 2\, \mathrm{Im}\, G^{(+)}(x, x'). \tag{6.28}$$
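The claim that only the spacetime-symmetrical part of a Green function contributes to W can be checked on a lattice. The following is a minimal sketch, assuming that spacetime is replaced by a handful of discrete points, so that J becomes a list of numbers and G(x, x') a square array. The names `quadratic_form` and `split_kernel`, and the random test data, are illustrative choices and are not defined anywhere in the text.

```python
import random


def quadratic_form(J, G):
    """Discrete analogue of W = -1/2 * sum_{x,x'} J(x) G(x,x') J(x')."""
    n = len(J)
    return -0.5 * sum(J[i] * G[i][j] * J[j] for i in range(n) for j in range(n))


def split_kernel(G):
    """Split G into parts that are even and odd under x <-> x'."""
    n = len(G)
    sym = [[0.5 * (G[i][j] + G[j][i]) for j in range(n)] for i in range(n)]
    anti = [[0.5 * (G[i][j] - G[j][i]) for j in range(n)] for i in range(n)]
    return sym, anti


random.seed(0)  # fixed seed so that repeated runs use the same test data
n = 6
J = [random.uniform(-1.0, 1.0) for _ in range(n)]
G = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
G_sym, G_anti = split_kernel(G)

w_full = quadratic_form(J, G)       # uses the whole kernel
w_sym = quadratic_form(J, G_sym)    # uses only the symmetric part
w_anti = quadratic_form(J, G_anti)  # uses only the antisymmetric part
print(w_full, w_sym, w_anti)
```

In this discrete setting the antisymmetric part drops out up to rounding error, so any kernel sandwiched between two identical sources may be replaced by its symmetrized form.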


6.1.4 Work and dissipation at steady state 


Related to the idea of transport is the idea of energy dissipation. In the presence 
of a source J, the field can decay due to work done on the source. Of course, 
energy is conserved within the field, but the presence of fluctuations (briefly 
active sources) allows energy to be transferred from one part of the field to 
another; i.e. it allows energy to be mixed randomly into the system in a form 
which cannot be used to do further work. This is an increase in entropy. 

The instantaneous rate at which the field decays is proportional to the imagi- 
nary part of the Feynman Green function. In order to appreciate why, we require 
a knowledge of the energy-momentum tensor and Lorentz transformations, so 
we must return to this in section 11.8.2. Nonetheless, it is possible to gain a 
partial understanding of the problem by examining the Green functions, their 
symmetry properties and the roles they play in generating solutions to the field 
equations. This is an important issue, which is reminiscent of the classical theory 
of hydrodynamics [53]. 

The power dissipated by the field is the rate at which the field does work on
external sources,⁴

$$P = \frac{dw}{dt}. \tag{6.29}$$

Although we cannot justify this until chapter 11, let us claim that the energy of
the system is determined by

$$\text{energy} = -\left.\frac{\delta S[\phi]}{\delta t}\right|_{\phi = \int GJ}. \tag{6.30}$$


4 Note that we use a small w for work since the symbol W is generally reserved to mean the 
value of the action, evaluated at the equations of motion.

So, the rate of change of energy in the system is equal to minus the rate at which
work is done by the system:

$$\frac{d}{dt}\, \text{energy} = -\frac{dw}{dt}. \tag{6.31}$$

Let us define the action functional W by

$$\frac{\delta W}{\delta J} = -\left.\frac{\delta S[\phi, J]}{\delta J}\right|_{\phi = \int GJ}, \tag{6.32}$$


where the minus sign is introduced so that this represents the work done by 
the system rather than the energy it possesses. The object W clearly has the 
dimensions of action, but we shall use it to identify the rate at which work is 
done. Eqn. (6.32) is most easily understood with the help of an example. The 
action for a scalar field is 


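$$\delta_J S = -\int (dx)\, \delta J\, \phi(x), \tag{6.33}$$

so, evaluating this at

$$\phi(x) = \int (dx')\, G(x, x')\, J(x'), \tag{6.34}$$

one may write, up to source-independent terms,

$$W[J] = -\frac{1}{2}\int (dx)(dx')\, J(x)\, G_F(x, x')\, J(x'). \tag{6.35}$$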


This bi-linear form recurs repeatedly in field theory. Schwinger’s source theory 
view of quantum fields is based on this construction, for its spacetime symmetry 
properties. Notice that it is based on the Feynman Green function. Eqn. (6.34) 
could have been solved by either the Feynman Green function or the retarded 
Green function. The explanation follows shortly. The work done over an 
infinitesimal time interval is given by 


$$\Delta w = \mathrm{Im}\, \frac{dW}{dt}. \tag{6.36}$$


Expressed in more useful terms, the instantaneous decay rate of the field is 


$$\int dt\, \gamma(t) = -\frac{2}{\chi_h}\, \mathrm{Im}\, W. \tag{6.37}$$


The sign, again, indicates the difference between work done and energy lost.
The factor of $\chi_h$ is included because we need a scale which relates energy and
time (frequency). In quantum mechanics, the appropriate scale is $\chi_h = \hbar$. In
fact, any constant with the dimensions of action will do here. There is nothing
specifically quantum mechanical about this relation. The power is proportional
to the rate of work done. The more useful quantity, the power spectrum $P(\omega)$ or
power at frequency $\omega$, is


P(w,t 
[vo CE Baer (6.38) 
ho 
giving the total power 
$$P = \int d\omega\, P(\omega). \tag{6.39}$$


We speak of the instantaneous decay rate because, in a real analysis of dissipa- 
tion, the act of work being done acts back on all time varying quantities. Taking 
the imaginary part of W to be the decay rate for the field assumes that the system 
changes only adiabatically, as we shall see below. 


6.1.5 Fluctuations 


The interpretation of the field as a statistical phenomenon is made plausible 
by considering the effect of infinitesimal perturbations to the field. This may 
be approached in two equivalent ways: (i) through the introduction of linear 
perturbations to the action, or sources 


$$S \to S - \int (dx)\, J\phi, \tag{6.40}$$

where J is assumed to be weak, or (ii) by writing the field in terms of a
fluctuating ‘average’ part $\langle\phi\rangle$ and a remainder part $\varphi$,

$$\phi(x) = \langle\phi(x)\rangle + \varphi(x). \tag{6.41}$$


These two constructions are equivalent for all dynamical calculations. This can 
be confirmed by the use of the above generating functionals, and a change of 
variable. 

It is worth spending a moment to consider the meaning of the function W[J]. 
Although originally introduced as part of the apparatus of quantum field theory 
[113], we find it here completely divorced from such origins, with no trace of 
quantum field theoretical states or operators (see chapter 15). The structure 
of this relation is a direct representation of our model of fluctuations or virtual 
processes. W[J] is the generator of fluctuations in the field. The Feynman Green 
function, in eqn. (6.25), is sandwiched between two sources symmetrically. 
The Green function itself is symmetrical: for retarded times, it propagates a 
field radiating from a past source to the present, and for advanced times it 
propagates the field from the present to a future source, where it is absorbed.
The symmetry of advanced and retarded boundary conditions makes W[J] an
explicit representation of a virtual process, at the purely classical level.

The first derivative of the effective action with respect to the source is

$$\frac{\delta W}{\delta J(x)} = \langle\phi(x)\rangle, \tag{6.42}$$

which implies that, for the duration of an infinitesimal fluctuation $J \neq 0$, the
field has an average value. If it has an average value, then it also deviates from
this value, thus we may write

$$\phi(x) = \frac{\delta W}{\delta J(x)} + \varphi(x), \tag{6.43}$$

where $\varphi(x)$ is the remainder of the field due to J. The average value vanishes
once the source is switched off, meaning that the fluctuation is the momentary 
appearance of a non-zero average in the local field. This is a smearing, stirring or 
mixing of the field by the infinitesimal generalized force J. The rate of change 
of this average is 


OWL] j i 
Gh) — = (P(x) b(*')) — (9a) p )). (6.44) 
This is the correlation function $C_{AB}(x, x')$, which becomes the Feynman Green
function as $J \to 0$. It signifies the response of the field to its own fluctuations
nearby, i.e. the extent to which the field has become mixed. The correlation
functions become large as the field becomes extremely uniform. This is called
(off-diagonal⁶) long-range order.

The correlation function interpretation is almost trivial at the classical
(free-field) level, but becomes enormously important in the interacting quantum
theory.


Instantaneous thermal fluctuations Fluctuations have basically the same form 
regardless of their origin. If we treat all thermal fluctuations as instantaneous, 
then we may account for them by a Euclidean Green function; the fluctuations 
of the zero-temperature field are generated by the Feynman Green function. In 
an approximately free theory, these two are the same thing. Consider a thermal 
Boltzmann distribution 


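$$\mathrm{Tr}\left(\rho(x, x')\, \phi(x)\phi(x')\right) = \mathrm{Tr}\left(e^{-\hbar\beta\omega}\, \phi(\omega)\phi(-\omega)\right). \tag{6.45}$$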


5 For detailed discussions of these points in the framework of quantum field theory, see the 
original papers of Feynman [46, 47, 48] and Dyson [41]. The generator W[J] was introduced 
by Schwinger in ref. [113]. 

6 ‘Off-diagonal’ refers to $x \neq x'$.

Since the average weight is $e^{iS/\hbar}$, and the Green function in momentum space
involves a factor $\exp(-i\omega(t - t'))$, one can form a representation of the
Boltzmann exponential factor $\exp(-\beta E)$ by analytically continuing

$$t \to t - i\hbar\beta \tag{6.46}$$

or

$$t' \to t' + i\hbar\beta. \tag{6.47}$$


This introduces an imaginary time element such as that obtained by Wick 
rotating to Euclidean space. It also turns the complex exponential into a real, 
decaying exponential. If the real part of the time variable plays no significant 
role in the dynamics (a static system), then it can be disregarded altogether. That 
is why Euclidean spacetime is essentially equivalent to equilibrium thermody- 
namics. However, from the spacetime symmetry of the correlation functions, 
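we should have the same result if we re-label $t$ and $t'$ so

$$G(t - t' + i\hbar\beta) = G(t' - t + i\hbar\beta) \tag{6.48}$$

or, in the Wick-rotated theory,

$$G(t_E - t'_E + \hbar\beta) = G(t'_E - t_E + \hbar\beta). \tag{6.49}$$

This is only possible if

$$e^{i\omega_E(t_E - t'_E + \hbar\beta)} = e^{i\omega_E(t'_E - t_E + \hbar\beta)} \tag{6.50}$$

or

$$e^{i\hbar\beta\omega_E} = 1. \tag{6.51}$$

From this, we deduce that the Euclidean Green function must be periodic in
imaginary time and that the Euclidean frequency takes the discrete values

$$\omega_E(n) = \frac{2n\pi}{\hbar\beta}, \qquad n = 0, \pm 1, \pm 2, \ldots, \tag{6.52}$$

where $\omega_E(n)$ are called the Matsubara frequencies.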
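The periodicity condition behind the Matsubara frequencies can be checked numerically. The following is a minimal sketch, assuming units in which $\hbar = 1$ by default and arbitrary illustrative values for $\beta$ and for the imaginary-time separation; the function names `matsubara_frequency` and `euclidean_mode` are hypothetical helpers introduced only for this check.

```python
import cmath
import math


def matsubara_frequency(n, beta, hbar=1.0):
    """Discrete Euclidean frequency omega_E(n) = 2*pi*n / (hbar*beta)."""
    return 2.0 * math.pi * n / (hbar * beta)


def euclidean_mode(omega_e, tau):
    """Plane-wave factor exp(i * omega_E * tau) for imaginary time tau."""
    return cmath.exp(1j * omega_e * tau)


beta = 2.5   # inverse temperature (arbitrary illustrative value)
hbar = 1.0   # assumption: units with hbar = 1
tau = 0.37   # arbitrary imaginary-time separation

for n in range(-2, 3):
    w = matsubara_frequency(n, beta, hbar)
    # Shifting tau by hbar*beta should leave the mode unchanged.
    mismatch = abs(euclidean_mode(w, tau + hbar * beta) - euclidean_mode(w, tau))
    print(n, w, mismatch)
```

A frequency that is not of this discrete form fails the same test, which is why periodicity in imaginary time restricts the allowed Euclidean frequencies.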


Thermal fluctuations in time Using the fluctuation model, we may represent 
a system in thermal equilibrium by the same idealization as that used in 
thermodynamics. We may think of the source and sink for thermal fluctuations 
as being a large reservoir of heat, so large that its temperature remains constant 
at $T = 1/k\beta$, even when we couple it to our system. The coupling to the heat
bath is by sources. Consider the fluctuation model as depicted in figure 6.1.

Fig. 6.1. Thermal fluctuations occur when the source is a heat reservoir at constant
temperature.


Since the fluctuation generator is W[J], which involves 


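$$\begin{aligned}
W[J] &= -\frac{1}{2}\int (dx)(dx')\, J(x)\, G_F(x, x')\, J(x') \\
&\sim J(x)\left[-G^{(+)}(\omega)\,\theta(\text{past}) + G^{(-)}(\omega)\,\theta(\text{future})\right] J(x'),
\end{aligned} \tag{6.53}$$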


then, during a fluctuation, the act of emission from the source is represented
by $-G^{(+)}(\omega)$ and the act of re-absorption is represented by $G^{(-)}(\omega)$. In other
words, these are the susceptibilities for thermal emission and absorption. In
an isolated system in thermal equilibrium, we expect the number of fluctuations
excited from the heat bath to be distributed according to a Boltzmann probability
factor [107]:

$$\frac{\text{emission}}{\text{absorption}} = \frac{-G^{(+)}(\omega)}{G^{(-)}(\omega)} = e^{\hbar\beta|\omega|}. \tag{6.54}$$

We use $\hbar\omega$ for the energy of the mode with frequency $\omega$ by tradition, though $\hbar$
could be replaced by any more appropriate scale with the dimensions of action.
This is a classical understanding of the well known Kubo-Martin-Schwinger
relation [82, 93] from quantum field theory. In the usual derivation, one makes 
use of the quantum mechanical time-evolution operator and the cyclic property 
of the trace in eqn. (6.45) to derive this relation for a thermal equilibrium. What 
makes these two derivations equivalent is the principle of spacetime uniformity 
of fluctuations. The argument given here is identical to Einstein’s argument for 
stimulated and spontaneous emission in a statistical two-state system, and the 
derivation of the well known A and B coefficients. It can be interpreted as the 
relative occupation numbers of particles with energy $\hbar\omega$. Here, the two states
are the heat bath and the system (see figure 6.2).

Fig. 6.2. Contact with a thermodynamic heat bath. Fluctuations represent emission
and absorption from a large thermal field.

We can use eqn. (6.54) to find the thermal forms for the Wightman functions
(and hence all the others). To do so we shall need the extra terms $X(k)$
mentioned in eqn. (5.41). Generalizing eqns. (5.66), we write,


$$\begin{aligned}
G^{(+)}(k) &= -2\pi i\, [\theta(k_0) + X]\, \delta(p^2c^2 + m^2c^4) \\
G^{(-)}(k) &= 2\pi i\, [\theta(-k_0) + Y]\, \delta(p^2c^2 + m^2c^4)
\end{aligned} \tag{6.55}$$


with X and Y to be determined. The commutator function $\tilde{G}(x, x')$ represents
the difference between the emission and absorption processes, which cannot
depend on the average state of the field since it represents the completeness
of the dynamical system (see section 14.1.8 and eqn. (5.73)). It follows that
X = Y. Then, using eqn. (6.54), we have

$$\theta(\omega) + X = e^{\hbar\beta|\omega|}\, [\theta(-\omega) + X] \tag{6.56}$$

and hence

$$X\, (e^{\hbar\beta|\omega|} - 1) = \theta(\omega), \tag{6.57}$$

since $\theta(-\omega)\, e^{\hbar\beta\omega} = 0$. Thus, we have

$$X = \theta(\omega)\, f(\omega), \tag{6.58}$$

where

$$f(\omega) = \frac{1}{e^{\hbar\beta\omega} - 1}. \tag{6.59}$$

This is the Bose-Einstein distribution. From this we deduce the following
thermal Green functions by re-combining $G^{(\pm)}(k)$:

$$G^{(+)}(k) = -2\pi i\, \theta(k_0)\, [1 + f(|k_0|)]\, \delta(p^2c^2 + m^2c^4) \tag{6.60}$$

$$G_F(k) = \frac{1}{p^2c^2 + m^2c^4 - i\epsilon} + 2\pi i\, f(|k_0|)\, \delta(p^2c^2 + m^2c^4)\, \theta(k_0). \tag{6.61}$$


For a subtlety in the derivation and meaning of eqn. (6.59), see section 13.4. 

The additional mixture of states, X, which arises from the external (boundary)
conditions on the system, thus plays the role of a macrostate close to steady
state. Notice that the retarded and advanced Green functions are independent of
X. This must be the case for unitarity to be preserved.
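For positive frequency, the emission and absorption condition used above reduces to $1 + f = e^{\hbar\beta\omega} f$. The following is a minimal numerical sketch of that statement, assuming units with $\hbar = 1$ by default and a few arbitrary sample frequencies; the helper names `bose_einstein` and `emission_absorption_ratio` are hypothetical and serve only this illustration.

```python
import math


def bose_einstein(omega, beta, hbar=1.0):
    """f(omega) = 1 / (exp(hbar*beta*omega) - 1), intended for omega > 0."""
    # expm1 keeps precision when hbar*beta*omega is small
    return 1.0 / math.expm1(hbar * beta * omega)


def emission_absorption_ratio(omega, beta, hbar=1.0):
    """Ratio (1 + f) / f of the thermal weights at positive frequency."""
    f = bose_einstein(omega, beta, hbar)
    return (1.0 + f) / f


beta = 0.8  # inverse temperature (arbitrary illustrative value)
for omega in (0.1, 1.0, 5.0):
    ratio = emission_absorption_ratio(omega, beta)
    # The ratio should reproduce the Boltzmann factor exp(hbar*beta*omega).
    print(omega, bose_einstein(omega, beta), ratio, math.exp(beta * omega))
```

The check only covers the positive-frequency branch, which is the branch on which X is non-zero.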


6.1.6 Divergent fluctuations: transport 


The fluctuations model introduced above can be used to define instantaneous 
transport coefficients in a statistical system. Long-term, time-dependent ex- 
pressions for these coefficients cannot be obtained because of the limiting 
assumptions of the fluctuation method. However, such a non-equilibrium 
situation could be described using the methods of non-equilibrium field theory. 

Transport implies the propagation or flow of a physical property from one 
place to another over time. Examples include 


• thermal conduction,
• electrical conduction (current),
• density conduction (diffusion).


The conduction of a property of the field from one place to another can only be 
accomplished by dynamical changes in the field. We can think of conduction as 
a persistent fluctuation, or a fluctuation with very long wavelength, which never 
dies. All forms of conduction are essentially equivalent to a diffusion process 
and can be analysed hydrodynamically, treating the field as though it were a 
fluid. 

Suppose we wish to consider the transport of a quantity X: we are therefore 
interested in fluctuations in this quantity. In order to generate such fluctuations, 
we need a source term in the action which is generically conjugate to the 
fluctuation (see section 14.5). We add this as follows: 


$$S \to S - \int (dx)\, X \cdot F, \tag{6.62}$$

and consider the generating functional of fluctuations W[F] as a function of the
infinitesimal source F(x); Taylor-expanding, one obtains

$$W[F] = W[0] + \int (dx)\, \frac{\delta W[0]}{\delta F(x)}\, F(x) + \frac{1}{2!}\int (dx)(dx')\, \frac{\delta^2 W[0]}{\delta F(x)\, \delta F(x')}\, F(x) F(x') + \ldots \tag{6.63}$$

Now, since

$$W[F] = \int (dx)(dx')\, F(x)\, \langle X(x) X(x')\rangle\, F(x'), \tag{6.64}$$


we have the first few terms of the expansion 


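$$\begin{aligned}
W[0] &= 0 \\
\frac{\delta W[0]}{\delta F(x)} &= \langle X(x)\rangle \\
\frac{\delta^2 W[0]}{\delta F(x)\, \delta F(x')} &= \frac{i}{\hbar}\, \langle X(x) X(x')\rangle
\end{aligned} \tag{6.65}$$

Thus, linear response theory gives us, generally,

$$\langle X(x)\rangle = \frac{i}{\hbar}\int (dx')\, \langle X(x) X(x')\rangle\, F(x') \tag{6.66}$$

or

$$\frac{\delta \langle X(x)\rangle}{\delta F(x')} = \frac{i}{\hbar}\, \langle X(x) X(x')\rangle. \tag{6.67}$$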


Since the correlation functions have been generated by the fluctuation generator 
W, they satisfy Feynman boundary conditions; however, in eqn. (6.81) we 
shall derive a relation which may be used to relate this to the linear response 
of the field with retarded boundary conditions. It remains, of course, to 
express the correlation functions of the sources in terms of known quantities. 
Nevertheless, the above expression may be used to determine the transport 
coefficients for a number of physical properties. As an example, consider the 
electrical conductivity, as defined by Ohm’s law, 


$$J_i = \sigma_{ij} E^j. \tag{6.68}$$

If we write $E_i \sim \partial_t A_i$ in a suitable gauge, then we have

$$J_i(\omega) = \sigma_{ij}\, \omega A^j(\omega), \tag{6.69}$$

or

$$\frac{\delta J_i}{\delta A^j} = \sigma_{ij}\, \omega. \tag{6.70}$$

From eqn. (6.67), we may therefore identify the transport coefficient as the limit
in which the microscopic fluctuations’ wavelength tends to infinity and leads to
a lasting effect across the whole system,

$$\sigma_{ij}(\omega) = \lim_{k \to 0} \frac{i}{\hbar\omega}\, \langle J_i(\omega) J_j(-\omega)\rangle. \tag{6.71}$$


The evaluation of the right hand side still needs to be performed. To do this, 
we need to know the dynamics of the sources J; and the precise meaning of the 
averaging process, signified by (...). Given this, the source method provides us 
with a recipe for calculating transport coefficients. 

In most cases, one is interested in calculating the average transport coeffi- 
cients in a finite temperature ensemble. Thermal fluctuations may be accounted 
for simply by noting the relationship between the Feynman boundary conditions 
used in the generating functional above and the retarded boundary conditions, 
which are easily computable from the mechanical response. We make use of 
eqn. (6.54) to write 


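$$\begin{aligned}
G_r(t, t') &= -\theta(t - t')\left[G^{(+)} + G^{(-)}\right] \\
&= -\theta(t - t')\, G^{(+)}\left[1 - e^{-\hbar\beta\omega}\right].
\end{aligned} \tag{6.72}$$

The retarded part of the Feynman Green function is

$$G_F = -\theta(t - t')\, G^{(+)}, \tag{6.73}$$

so, over the retarded region,

$$G_r(x, x') = (1 - e^{-\hbar\beta\omega})\, G_F(x, x'), \tag{6.74}$$

giving

$$\sigma_{ij}(\omega) = \lim_{k \to 0} \frac{(1 - e^{-\hbar\beta\omega})}{\hbar\omega}\, \langle J_i(\omega) J_j(-\omega)\rangle, \tag{6.75}$$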

for the conductivity tensor, assuming a causal response between source and 
field. The formula in eqn. (6.75) is one of a set of formulae which relate the 
fluctuations in the field to transport coefficients. The strategy for finding such 
relations is to identify the source which generates fluctuations in a particular 
quantity. We shall return to this problem in general in section 11.8.5. 
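The step between the two lines of eqn. (6.72) uses eqn. (6.54) to eliminate the absorption function. As a sketch of that step, assuming eqn. (6.54) is read as the ratio of $-G^{(+)}$ to $G^{(-)}$ at a fixed positive frequency, and adding a small-$\hbar\beta\omega$ expansion purely as an illustration that is not part of the original argument:

```latex
\begin{align*}
G^{(-)}(\omega) &= -e^{-\hbar\beta\omega}\, G^{(+)}(\omega), \\
G^{(+)}(\omega) + G^{(-)}(\omega) &= G^{(+)}(\omega)\left[1 - e^{-\hbar\beta\omega}\right], \\
\frac{1 - e^{-\hbar\beta\omega}}{\hbar\omega}
  &= \beta\left[1 - \tfrac{1}{2}\hbar\beta\omega
     + O\!\left((\hbar\beta\omega)^2\right)\right].
\end{align*}
```

Under these assumptions, the thermal prefactor in eqn. (6.75) tends to $\beta$ when $\hbar\beta\omega \ll 1$, so the leading term no longer depends on the scale $\hbar$.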


6.1.7 Fluctuation dissipation theorem 


In a quasi-static system, the time-averaged field may be defined by 


$$\langle\phi\rangle = \frac{1}{T}\int_{t - T/2}^{t + T/2} \phi(x)\, dt. \tag{6.76}$$




From the generating functional in eqn. (6.5), we also have 


o ôW[J] 
(P(x)) = ih BG)’ (6.77) 
and further 
apl ki R i 
a a OOP = EG ET N (6.78) 


The field may always be causally expressed in terms of the source, using the 
retarded Green function in eqn. (6.76), provided the source is weak so that higher 
terms in the generating functional can be neglected; thus 


$$\langle\phi\rangle = \frac{1}{T}\int_{t - T/2}^{t + T/2} dt \int (dx')\, G_r(x, x')\, J(x'). \tag{6.79}$$
Now, using eqns. (6.78) and (6.79), we find that 
è È ba) = -Im 3 Grle, x) = ZG, x!) (6.80) 
zs = —Im = —G, A 2 y 
ana PPA ETE p e 


Thus, on analytically continuing to Euclidean space, 
$$G_r(\omega) = -\hbar\beta\omega\, G_F(\omega). \tag{6.81}$$


This is the celebrated fluctuation dissipation theorem. It is as trivial as it is 
profound. It is clearly based on assumptions about the average behaviour of 
a statistical system over macroscopic times T, but also refers to the effects of 
microscopic fluctuations over times contained in x — x’. It relates the Feynman 
Green function to the retarded Green function for a time-averaged field; i.e. it
relates the correlation function, which measures the spontaneous fluctuations

$$\varphi = \phi - \langle\phi\rangle \tag{6.82}$$

in the field, to the retarded Green function, which measures the purely mechanical
response to an external source. The fluctuations might be either thermal or
quantum in origin; it makes no difference: their existence is implicitly postulated
through the use of the correlation function. Thus, the truth or falsity of this 
expression lies in the very assumption that microscopic fluctuations are present, 
even though the external source $J \to 0$ on average (over macroscopic time T).
This requires further elaboration. 

In deriving the above relation, we have introduced sources and then taken the 
limit in which they tend to zero. This implies that the result is only true for an 
infinitesimal but non-zero source J. The source appears and disappears, so that 
it is zero on average, but it is present long enough to change the distribution
of modes in the system, little by little. An observer who could resolve only
macroscopic behaviour would therefore be surprised to see the system changing,
apparently without cause. This theorem is thus about the mixing of scales.

The fluctuation dissipation theorem tells us that an infinitesimal perturbation
to the field, $J \to 0$, will lead to microscopic fluctuations, which can decay by
mechanical response (mixing or diffusion). The decay rate may be related to the
imaginary part of the correlation function, but this gives only an instantaneous
rate of decay since the assumptions we use to derive the expression are valid
only for the brief instant of the fluctuation.⁷

The Feynman Green function seems to have no place in a one-particle 
mechanical description, and yet here it is, at the classical level. But we have 
simply introduced it ad hoc, and the consequences are profound: we have 
introduced fluctuations into the system. This emphasizes the importance of 
boundary conditions and the generally complex nature of the field. 


6.2 Spontaneous symmetry breaking 


Another aspect of fluctuating statistical theories, which arises in connection with 
symmetry, is the extent to which the average state of the field, $\langle\phi\rangle$, displays
the full symmetry afforded it by the action. In interacting theories, collective 
effects can lead to an average ordering of the field, known as long-range order. 
The classic example of this is the ferromagnetic state in which spin domains 
line up in an ordered fashion, even though the action allows them to point 
in any direction, and indeed the fluctuations in the system occur completely 
at random. However, it is energetically favourable for fluctuations to do this 
close to an average state in which all the spins are aligned, provided the 
fluctuations are small. Maximum stability is then achieved by an ordered state. 
As fluctuations grow, perhaps by increasing temperature, the stability is lost and 
a phase transition can occur. This problem is discussed in section 10.7, after the 
chapters on symmetry. 


7 The meaning of this ‘theorem’ for Schwinger’s source theory viewpoint is now clear [119]. 
Spacetime uniformity in the quantum transformation function tells us that the Green function 
we should consider is the Feynman Green function. The symmetry of the arguments 
tells us that this is a correlation function and it generates fluctuations in the field. The 
infinitesimal source is a model for these fluctuations. Processes referred to as the decay of 
the vacuum in quantum field theory are therefore understood in a purely classical framework, 
by understanding the meaning of the Feynman boundary conditions. 


7 Examples and applications


To expose the value of the method in the foregoing chapters, it is instructive to 
apply it to a number of important and well known physical problems. Through 
these examples we shall see how a unified methodology makes the solution of 
a great many disparate systems essentially routine. The uniform approach does 
not necessarily convey with it any automatic physical understanding, but then 
no approach does. What we learn from this section is how many problems can 
be reduced to the basics of ‘cause followed by effect’, or, here, ‘source followed 
by field’. 


7.1 Free particles 


Solving Newton’s law F = ma using a Green function approach is hardly to 
be recommended for any practical purpose; in fact, it is a very inefficient way 
of solving the problem. However, it is useful to demonstrate how the Green 
function method can be used to generate the solution to this problem. This 
simple test of the theory helps to familiarize us with the working of the method 
in practice. The action for a one-dimensional-particle system is 


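$$S = \int dt \left\{ -\frac{1}{2} m \dot{x}^2 - F x \right\}. \tag{7.1}$$

The variation of the action leads to

$$\delta S = \int dt\, \{ m\ddot{x} - F \}\, \delta x + \Delta(m\dot{x})\, \delta x = 0, \tag{7.2}$$

which gives us the equation of motion

$$F = m\ddot{x} \tag{7.3}$$

and the continuity condition

$$\Delta(m\dot{x}) = 0, \tag{7.4}$$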




which is the conservation of momentum. The equation of motion can be written 
in the form of ‘operator acting on field equals source’, 


$$\mathcal{D} x = J, \tag{7.5}$$

by rearranging

$$\partial_t^2\, x(t) = F/m. \tag{7.6}$$

Clearly, we can integrate this equation directly with a proper initial condition
$x(t_0) = x_0$, $\dot{x}(t_0) = v$, to give

$$x(t) - x_0 = \frac{F}{2m}(t - t_0)^2 + v(t - t_0). \tag{7.7}$$


But let us instead try to use the Green function method to solve the problem. 
There are two ways to do this: the first is quite pointless and indicates a 
limitation of the Green function approach, mentioned in section 5.2.4. The 
second approach demonstrates a way around the limitation and allows us to see 
the causality principle at work. 


Method 1 The operator on the left hand side of eqn. (7.6) is $\partial_t^2$, so we define a
Green function

$$\partial_t^2\, G(t, t') = \delta(t, t'). \tag{7.8}$$

As usual, we expect to find an integral expression by Fourier transforming the
above equation:

$$G(t - t') = \int \frac{d\omega}{2\pi}\, \frac{e^{-i\omega(t - t')}}{-\omega^2}. \tag{7.9}$$

This expression presents us with a problem, however: it has a non-simple pole,
which must be eliminated somehow. One thing we can do is to re-write the
integral as follows:

$$\begin{aligned}
G(t - t') &= \int d\tilde{t} \int d\tilde{t} \int \frac{d\omega}{2\pi}\, e^{-i\omega\tilde{t}} \\
&= \int d\tilde{t} \int d\tilde{t}\; \delta(\tilde{t}),
\end{aligned} \tag{7.10}$$

where $\tilde{t} = t - t'$. It should be immediately clear that this is just telling us to
replace the Green function with a double integration (which is how one would
normally solve the equation). We obtain two extra, unspecified integrals:

$$\begin{aligned}
x(t) &= \int dt'\, G(t, t')\, F/m \\
&= \int dt' \int d\tilde{t} \int d\tilde{t}\; \delta(t - t')\, F/m \\
&= \int d\tilde{t} \int d\tilde{t}\; F/m \\
&= \int d\tilde{t} \left[ \frac{F}{m}(\tilde{t} - t_0) + v \right] \\
&= \frac{F}{2m}(t - t_0)^2 + v(t - t_0) + x_0.
\end{aligned} \tag{7.11}$$


So, the result is the same as that obtained by direct integration and for the 
same reason: the Green function method merely adds one extra (unnecessary) 
integration and re-directs us to integrate the equation directly. The problem here 
was that the denominator contained a non-simple pole. We can get around this 
difficulty by integrating it in two steps. 


Method 2 Suppose we define a Green function for the linear differential operator

$$\partial_t\, g(t, t') = \delta(t, t'). \tag{7.12}$$


From section A.2, in Appendix A, we immediately recognize this function as the 
Heaviside step function. (We could take the Fourier transform, but this would 
only lead to an integral representation of the step function.) The solution has 
advanced and retarded forms 


$$\begin{aligned}
g_r(t, t') &= \theta(t - t') \\
g_a(t, t') &= -\theta(t' - t).
\end{aligned} \tag{7.13}$$


Now we have an integrable function, which allows us to solve the equation in 
two steps: 


$$\begin{aligned}
\partial_t x(t) &= \int dt'\, g_r(t, t')\, F/m \\
&= \frac{F}{m}(t - t') + \partial_t x(t') \qquad (t > t').
\end{aligned} \tag{7.14}$$

Then, applying the Green function again,

$$\begin{aligned}
x(t) &= \int dt'\, g_r(t - t') \left[ \frac{F}{m}(t' - t_0) + \partial_t x(t_0) \right] \\
&= \frac{F}{2m}(t - t_0)^2 + v(t - t_0) + x_0.
\end{aligned} \tag{7.15}$$

Again we obtain the usual solution, but this time we see explicitly the causality
implied by a linear derivative. The step function tells us that the solution only
exists for a causal relationship between force F and response x(t).
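The two-step construction of Method 2 can be imitated numerically: each application of the step-function Green function is a running integral over earlier times only. The following is a minimal sketch, assuming a constant force, a uniform time grid and the trapezoid rule; the function name `retarded_integral` and all numerical values are illustrative assumptions rather than anything taken from the text.

```python
def retarded_integral(samples, dt, initial_value):
    """Discrete version of y(t) = y(t0) + integral of theta(t - t') f(t') dt'.

    Only samples at earlier times contribute to each output value, which is
    the causal (retarded) property of the step-function kernel.
    """
    out = [initial_value]
    for k in range(len(samples) - 1):
        # trapezoid rule over one time step
        out.append(out[-1] + 0.5 * dt * (samples[k] + samples[k + 1]))
    return out


# Illustrative parameters (assumptions, not taken from the text)
F, m = 2.0, 4.0      # constant force and mass
x0, v0 = 1.0, 0.5    # initial position and velocity at t0
t0, t1, steps = 0.0, 3.0, 300
dt = (t1 - t0) / steps

accel = [F / m] * (steps + 1)                   # source term F/m on the grid
velocity = retarded_integral(accel, dt, v0)     # first application
position = retarded_integral(velocity, dt, x0)  # second application

exact = x0 + v0 * (t1 - t0) + F / (2.0 * m) * (t1 - t0) ** 2
print(position[-1], exact)
```

Because the acceleration is constant, the trapezoid rule reproduces the closed-form solution up to rounding error, and the initial values enter exactly where the two unspecified integration constants appear in the analytic treatment.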


7.1.1 Velocity distributions 


In a field of many particles, there is usually a distribution of velocities or 
momenta within the field. In a particle field this refers to the momenta $p_i$ of
individual localizable particles. In other kinds of field there is a corresponding
distribution of wavenumbers $k_i$ of the wave modes which make up the field.
The action describes the dynamics of a generic particle, but it does not capture 
the macroscopic state of the field. The macrostate is usually described in terms 
of the numbers of components (particles or modes) with a given momentum or 
energy (the vector nature of momentum is not important in an isotropic plasma). 

The distribution function f is defined so that its integral with respect to the 
distribution parameter gives the number density or particles per unit volume. We 
use a subscript to denote the control variable: 


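$$\begin{aligned}
N &= \int d^n\mathbf{k}\; f_k(\mathbf{k}) \\
&= \int d^n\mathbf{p}\; f_p(\mathbf{p}) \\
&= \int d^n\mathbf{v}\; f_v(\mathbf{v}).
\end{aligned} \tag{7.16}$$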


This distribution expresses averages of the field. For example, the average 
energy is the weighted average of the energies of the different momenta: 


$$\langle E\rangle = \frac{1}{N}\int d^n\mathbf{k}\; f_k(\mathbf{k})\, E(\mathbf{k}). \tag{7.17}$$
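As an illustration of a distribution-weighted average, the following minimal sketch evaluates the number density and the average kinetic energy numerically. It assumes a one-dimensional Maxwell-Boltzmann velocity distribution, which the text does not specify, and uses a simple midpoint rule; the names `maxwell_1d` and `integrate` and all parameter values are hypothetical.

```python
import math


def maxwell_1d(v, mass, kT, density):
    """One-dimensional Maxwell-Boltzmann distribution in velocity.

    Normalized so that its integral over v equals the number density
    (an illustrative choice of distribution, not taken from the text).
    """
    norm = density * math.sqrt(mass / (2.0 * math.pi * kT))
    return norm * math.exp(-mass * v * v / (2.0 * kT))


def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule quadrature of func over the interval [lo, hi]."""
    dv = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * dv) for k in range(steps)) * dv


mass, kT, density = 2.0, 1.5, 3.0
v_max = 12.0 * math.sqrt(kT / mass)  # cutoff far out in the Gaussian tail

# Number density: integral of the distribution over the control variable
N = integrate(lambda v: maxwell_1d(v, mass, kT, density), -v_max, v_max)

# Average energy: distribution-weighted kinetic energy divided by N
mean_E = integrate(
    lambda v: maxwell_1d(v, mass, kT, density) * 0.5 * mass * v * v,
    -v_max, v_max) / N

print(N, mean_E, 0.5 * kT)
```

For this particular distribution the weighted average should agree with the equipartition value kT/2 per degree of freedom, which gives a simple consistency check on the normalization.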


7.2 Fields of bound particles 


A field of particles, sometimes called a plasma when charged, is an effective 
field, formed from the continuum approximation of discrete particles. Its 
purpose is to capture some of the bulk dynamics of material systems; it should 
not be confused with the deeper description of the atoms and their sub-atomic 
components in terms of fundamental fields which might focus on quite different 
properties, not relevant for the atoms in a bulk context. The starting point for 
classical analyses of atomic systems coupled to an electromagnetic field is the 
idea that matter consists of billiard-ball atoms with some number density $\rho_N$,
and that the wavelength of radiation is long enough to be insensitive to the 
particle nature of the atoms. The only important fact is that there are many 
particles whose combined effect in space is to act like a smooth field. When
perturbed by radiation (which we shall represent as an abstract source $J_i$) the
particles are displaced by a spatial vector $s^i$ where $i = 1, 2, \ldots, n$. The action
for this system may be written

$$S = \frac{1}{\sigma_x}\int (dx) \left\{ -\frac{1}{2} m \dot{s}^2 + \frac{1}{2}\kappa s^2 - m\gamma\, s\dot{s} - J^i s_i \right\}. \tag{7.18}$$

This requires some explanation. The factor of the spatial volume of the total
system, $\sigma_x$, reflects the fact that this is an effective average formulation. Dividing
by a total scale always indicates an averaging procedure. As an alternative to
using this explicit value, we could use the average density, $\rho = m/\sigma_x$, and other
parameter densities to express the action in appropriate dimensions. The first
term is a kinetic energy term, which will describe the acceleration of particles
in response to the forcing term $J^i$. The second term is a harmonic oscillator
term, which assumes that the particles are bound to a fixed position $s_i = 0$, just
as electrons are bound to atoms or ions are bound in a lattice. The effective
spring constant of the harmonic interaction is $\kappa$. Because $s^i(x)$ represents the
displacement of the particles from their equilibrium position, we use the symbol
$s^i$ rather than $x^i$, since it is not the position which is important, but the deviation
from equilibrium position. The dimensions of $s^i(x)$ are position divided by the
square-root of the density because of the volume integral in the action, and $s^i(x)$
is a function of $x^\mu$ because the displacement could vary from place to place and
from time to time in the system. The final term in eqn. (7.18) is a term which will
provide a phenomenological damping term for oscillations, as though the system 
were leaky or had friction. As we have already discussed in section 4.2, this kind 
of term is not well posed unless there is some kind of boundary in the system 
which can leak energy. The term is actually a total derivative. Nevertheless, 
since this is not a microscopic fundamental theory, it is possible to make sense 
of this as an effective theory by ‘fiddling’ with the action. This actually forces 
us to confront the reason why such terms cannot exist in fundamental theories, 
and is justifiable so long as we are clear about the meaning of the procedure. 
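The claim that the damping term is a total derivative can be checked in one line. The following is only a worked restatement using the symbols of eqn. (7.18), with $t_1$ and $t_2$ standing for arbitrary (assumed) end points of the time integration:

```latex
-m\gamma\, s_i \dot{s}^i = -\frac{m\gamma}{2}\,\frac{d}{dt}\left(s_i s^i\right),
\qquad
\int_{t_1}^{t_2} dt \left(-m\gamma\, s_i\dot{s}^i\right)
  = -\frac{m\gamma}{2}\Big[\, s_i s^i \,\Big]_{t_1}^{t_2}.
```

The term therefore contributes only at the boundary of the time integration and cannot alter the equation of motion in the interior.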
The variation of the action is given, after partial integration, by 


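$$
\begin{aligned}
\delta S = {} & \frac{1}{\sigma_x}\int (dx)\,\left\{ m\ddot{s}_i + \kappa s_i - m\gamma\dot{s}_i + m\gamma\dot{s}_i - J_i \right\}\delta s^i \\
 & + \frac{1}{\sigma_x}\int d\sigma\,\left[ m\dot{s}_i + m\gamma s_i \right]\delta s^i.
\end{aligned} \tag{7.19}
$$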


The terms containing $\gamma$ clearly cancel, leaving only a surface term. But suppose
we divide the source into two parts:

$$ J^i = J^i_\gamma + J^i_s, \tag{7.20} $$

where $J^i_\gamma$ is postulated to satisfy the equation

$$ -m\gamma\dot{s}^i = J^i_\gamma. \tag{7.21} $$
This then has the effect of preventing the frictional terms from completely
disappearing. Clearly this is a fiddle, since we could have simply introduced
a source in the first place, with a velocity-dependent nature. However, this
is precisely the point. If we introduce a source or sink for the energy of the
system, then it is possible to violate the conservational properties of the action
by claiming some behaviour for $J^i_\gamma$ which is not actually determined by the
action principle. The lesson is this: if we specify the behaviour of a field rather 
than deriving it from the action principle, we break the closure of the system 
and conservation laws. What this tells us is that dissipation in a system has to 
come from an external agent; it does not arise from a closed mechanical theory, 
and hence this description of dissipation is purely phenomenological. Taking 
eqn. (7.21) as given, we have the equation of motion for the particles 


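$$ m\ddot{s}^i - m\gamma\dot{s}^i + \kappa s^i = J^i_s, \tag{7.22} $$

with continuity condition

$$ \Delta\left(m\dot{s} + m\gamma s\right) = 0. \tag{7.23} $$

It is usual to define the natural frequency $\omega_0^2 = \kappa/m$ and write

$$ \left(\partial_t^2 - \gamma\partial_t + \omega_0^2\right)s^i(x) = \frac{J^i_s}{m}. \tag{7.24} $$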


If we consider a plane wave solution of the form 


$$ s^i(x) = \int (dk)\; e^{i(k_i x^i - \omega t)}\, s^i(k), \tag{7.25} $$

then we may write

$$ \left(-\omega^2 + i\gamma\omega + \omega_0^2\right)s^i(k) = \frac{J^i_s(k)}{m}. \tag{7.26} $$


From this we see that the Green function $G_{ij}(x,x')$ for $s^i(x)$ is

$$ G_{ij}(x,x') = \delta_{ij}\int (dk)\;\frac{e^{i(k_i x^i - \omega t)}}{\left(-\omega^2 + i\gamma\omega + \omega_0^2\right)}. \tag{7.27} $$

As long as the integral contains both positive and negative frequencies, this
function is real and satisfies retarded boundary conditions. It is often referred to
as the susceptibility, $\chi_{ij}$. In a quantum mechanical treatment, $\hbar\omega_0 = E_2 - E_1$ is
the difference between two energy levels.
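The frequency-space factor in eqn. (7.27) can be explored numerically. The sketch below is only illustrative: the function name `susceptibility` and the parameter values are assumptions, not taken from the text. It evaluates $1/(-\omega^2 + i\gamma\omega + \omega_0^2)$, shows the resonance peak near $\omega = \omega_0$, and checks that the value at $-\omega$ is the complex conjugate of the value at $+\omega$, which is the property that makes the integral over both signs of frequency real.

```python
import cmath


def susceptibility(omega, omega0, gamma):
    """Frequency-space response 1 / (-omega^2 + i*gamma*omega + omega0^2).

    This is the integrand factor of eqn. (7.27); omega0 is the natural
    frequency and gamma the phenomenological damping rate.
    """
    return 1.0 / (-omega**2 + 1j * gamma * omega + omega0**2)


if __name__ == "__main__":
    omega0, gamma = 1.0, 0.1  # illustrative values in arbitrary units (assumed)
    for omega in (0.5, 0.9, 1.0, 1.1, 2.0):
        chi = susceptibility(omega, omega0, gamma)
        print(f"omega={omega:.1f}  |chi|={abs(chi):.3f}  arg={cmath.phase(chi):+.3f}")

    # Reality condition: chi(-omega) is the complex conjugate of chi(omega).
    w = 0.7
    print(susceptibility(-w, omega0, gamma), susceptibility(w, omega0, gamma).conjugate())
```

On resonance the real part of the denominator vanishes, so the magnitude of the response is $1/(\gamma\omega_0)$ and is limited only by the damping.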
Notice that the energy density

$$ P^i E_i = \int E_i(x)\, G_{ij}(x,x')\, E_j(x')\,(dx') \tag{7.28} $$

cannot be expressed in terms of a retarded Green function, since the above
expression requires a spacetime symmetrical Green function. The Feynman 
Green function is then required. This indicates that the energy of the field 
is associated with a statistical balance of virtual processes of emission and 
absorption, rather than simply being a process of emission. In general, the 
interaction with matter introduces an imaginary part into the expression for the 
energy, since the Green function idealizes the statistical processes by treating 
them as steady state, with no back-reaction. It thus implicitly assumes the 
existence of an external source whose behaviour is unaffected by the response 
of our system. The energy density reduces to $E^2$ in the absence of material
interactions and the result is then purely real. 


7.3 Interaction between matter and radiation 


Classical field theory is normally only good enough to describe non-interacting 
field theories. A complete description of interactions requires the quantum 
theory. The exception to this rule is the case of an external source. In 
electromagnetism we are fortunate in having a system in which the coupling 
between matter and radiation takes on the form of a linear external source $J^\mu$,
so there are many systems which behave in an essentially classical manner. 


7.3.1 Maxwell’s equations 


The interaction between matter and radiation begins with the relativistically 
invariant Maxwell action 


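$$ S = \int (dx)\,\left\{ \frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} - J^\mu A_\mu \right\}. \tag{7.29} $$

The variation of the action,

$$
\begin{aligned}
\delta S &= \int (dx)\,\left\{ \frac{1}{\mu_0}(\partial^\mu\delta A^\nu)F_{\mu\nu} - J^\mu\delta A_\mu \right\} \\
 &= \int (dx)\;\delta A^\nu\left\{ -\frac{1}{\mu_0}\partial^\mu F_{\mu\nu} - J_\nu \right\} + \frac{1}{\mu_0}\int d\sigma^\mu\,\left(\delta A^\nu F_{\mu\nu}\right) \\
 &= 0,
\end{aligned} \tag{7.30}
$$

leads immediately to the field equations for the electromagnetic field interacting
with charges in an ambient vacuum:

$$ \partial_\mu F^{\mu\nu} = -\mu_0 J^\nu. \tag{7.31} $$

The spatial continuity conditions are

$$ \Delta F_{i\mu} = 0, \tag{7.32} $$

or

$$ \Delta B_i = 0. \tag{7.33} $$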


7.3.2 Electromagnetic waves 


In the Lorentz gauge, $\partial^\mu A_\mu = 0$, Maxwell's equations (7.31) reduce to

$$ -\Box A_\mu = \mu_0 J_\mu. \tag{7.34} $$

The solution to this equation is a linear combination of a particular integral
with non-zero $J_\mu$ and a complementary function with $J_\mu = 0$. The free-field
equation,

$$ -\Box A_\mu = 0, \tag{7.35} $$


is solved straightforwardly by taking the Fourier transform:

$$ A_\mu(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\; e^{ik_\mu x^\mu}\, A_\mu(k). \tag{7.36} $$

Substituting into the field equation, we obtain the constraint

$$ \chi(k) = k^2 = k^\mu k_\mu = \left(-\frac{\omega^2}{c^2} + k^i k_i\right) = 0. \tag{7.37} $$

This is the result we found in eqn. (2.52), obtained only slightly differently. The
retarded and Feynman Green functions for the field clearly satisfy

$$ -\Box\, D_{\mu\nu}(x,x') = g_{\mu\nu}\, c\,\delta(x,x'). \tag{7.38} $$


Thus, the solution to the field in the presence of the source is, by analogy with 
eqn. (5.41), 


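$$
\begin{aligned}
A_\mu(x) &= \mu_0\int (dx')\; D_{\mu\nu}(x,x')\, J^\nu(x') \\
 &= \mu_0\int (dx')\int \frac{d^{n+1}k}{(2\pi)^{n+1}}\; e^{ik(x-x')}\left(\frac{1}{k^2} + X(k)\,\delta(k^2)\right) J_\mu(x'),
\end{aligned} \tag{7.39}
$$

where $X(k)$ is an arbitrary and undetermined function. In order to determine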
this function, we need to make some additional assumptions and impose some 
additional constraints on the system. 
7.3.3 Dispersion and the Faraday effect


When linearly polarized electromagnetic waves propagate through a magnetized 
medium, in the direction of the applied magnetization, the plane of polarization 
becomes rotated in an anti-clockwise sense about the axis of propagation by an 
amount proportional to z, where z is the distance travelled through the medium. 
The angle of rotation is

$$ \psi = VBz, \tag{7.40} $$


where B is the magnetic field and V is Verdet’s constant, the value of which 
depends upon the dielectric properties of the material. This phenomenon is 
important in astronomy in connection with the polarization of light from distant 
stars. It is also related to optical activity and the Zeeman effect. 

Classical descriptions of this effect usually involve a decomposition of the 
electric field vector into two contra-rotating vectors which are then shown 
to rotate with different angular velocities. The sum of these two vectors 
represents the rotation of the polarization plane. An alternative description can 
be formulated in complex coordinates to produce the same result more quickly 
and without prior assumptions about the system. 

Let us now combine some of the above themes in order to use the action 
method to solve the Faraday system. Suppose we have a particle field, $s^i(x)$, of
atoms with number density $\rho_N$, which measures the displacement of optically
active electrons $-e$ from their equilibrium positions, and a magnetic field $B = B_3$,
which points along the direction of motion for the radiation. In the simplest
approximation, we can represent the electrons as being charges on springs with
spring constant $\kappa$. As they move, they generate an electric current density

$$ J^i = -e\rho_N\dot{s}^i. \tag{7.41} $$
Since the Faraday effect is about the rotation of radiation's polarization vector
(which is always perpendicular to the direction of motion $x_3$), we need only $s^i$
for $i = 1, 2$. The action then can be written

$$ S = \frac{1}{\sigma_x}\int (dx)\,\left\{ -\frac{1}{2}m(\partial_t s^i)(\partial_t s_i) + \frac{1}{2}eB\epsilon_{ij}s^i(\partial_t s^j) + \frac{1}{2}\kappa s^i s_i - J^i s_i \right\}. \tag{7.42} $$

Here, $J^i$ is an external source which we identify with the radiation field

$$ J^i(x) = -eE^i(x) = -ec\,F^{0i}(x). \tag{7.43} $$

As is often the case with matter-radiation interactions, the relativistically
invariant electromagnetic field is split into $E^i$, $B^i$ by the non-relativistically
invariant matter field $s^i$. The field equations are now obtained by varying the
action with respect to $\delta s^i$:


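$$
\begin{aligned}
\delta S = {} & \int (dx)\,\left\{ m\ddot{s}_i + eB\epsilon_{ij}\dot{s}^j + \kappa s_i - J_i \right\}\delta s^i \\
 & + \int d\sigma\,\left[ m\dot{s}_i + eB\epsilon_{ij}s^j \right]\delta s^i.
\end{aligned} \tag{7.44}
$$

Thus, the field equations are

$$ m\ddot{s}_i + eB\epsilon_{ij}\dot{s}^j + \kappa s_i = J_i = -eE_i, \tag{7.45} $$

and continuity of the field requires

$$ \Delta\left(m\dot{s}_i\right) = 0, \qquad \Delta\left(eB\epsilon_{ij}s^j\right) = 0. \tag{7.46} $$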


The first of these is simply the conservation of momentum for the electrons. 
The latter tells us that any sudden jumps in the magnitude of magnetic field 
must be compensated for by a sudden jump in the amplitude of the transverse 
displacement. 

If we compare the action and the field equations with the example in section 
7.2, it appears as though the magnetic field has the form of a dissipative term. 
In fact this is not the case. Magnetic fields do no work on particles. The crucial 
point is the presence of the anti-symmetric matrix $\epsilon_{ij}$ which makes the term well
defined and non-zero. 

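Dividing eqn. (7.45) through by the mass, we can define the Green function
for the $s^i(x)$ field:

$$ \left[\left(\frac{d^2}{dt^2} + \omega_0^2\right)\delta_{ij} + \frac{eB}{m}\epsilon_{ij}\frac{d}{dt}\right]G_{jk}(x,x') = \delta_{ik}\,\delta(x,x'), \tag{7.47} $$

where $\omega_0^2 = \kappa/m$, so that the formal solution for the field is

$$ s_i(x) = \int (dx')\; G_{ij}(x,x')\,\frac{J_j(x')}{m}. \tag{7.48} $$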


Since we are interested in coupling this equation to an equation for the radiation
field $J_i$, we can go no further. Instead, we turn to the equation of motion (7.34)
for the radiation. Because of the gauge freedom, we may use a gauge in which
$A_0 = 0$; this simplifies the equation to

$$ -\Box A_i = \mu_0 J_i, \qquad E_i = -\partial_t A_i. \tag{7.49} $$

Thus, using the Green function $D_{ij}(x,x')$,

$$ -\Box\, D_{ij}(x,x') = \delta_{ij}\, c\,\delta(x,x'), \tag{7.50} $$

for $A_i(x)$, we may write the solution for the electric field formally as

$$ E_i(x) = -\mu_0\,\partial_t\int (dx')\; D_{ij}(x,x')\left(-e\rho_N\dot{s}_j(x')\right) = -J_i/e. \tag{7.51} $$
This result can now be used in eqn. (7.45), giving 


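$$ \left[\left(\frac{d^2}{dt^2} + \omega_0^2\right)\delta_{ij} + \frac{eB}{m}\epsilon_{ij}\partial_t\right]s^j(x) = -\frac{e^2}{m}\rho_N\mu_0\;\partial_t\int (dx')\; D_{ij}(x,x')\,\partial_t s^j(x'). \tag{7.52} $$

Operating from the left with $-\Box$, we have

$$ (-\Box)\left[\left(\frac{d^2}{dt^2} + \omega_0^2\right)\delta_{ij} + \frac{eB}{m}\epsilon_{ij}\partial_t\right]s^j(x) = -\frac{e^2}{m}\rho_N\mu_0\;\ddot{s}_i(x). \tag{7.53} $$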


This is a matrix equation, with a symmetric part proportional to $\delta_{ij}$ and an
anti-symmetric part proportional to $\epsilon_{ij}$. If we take plane wave solutions moving
along the $x_3 = z$ axis,

$$
\begin{aligned}
s^i(x) &= \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\; e^{i(k_z z - \omega t)}\, s^i(k)\,\delta(\chi) \\
E^i(x) &= \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\; e^{i(k_z z - \omega t)}\, E^i(k)\,\delta(\chi),
\end{aligned} \tag{7.54}
$$

for the dispersion relation $\chi$ implied by eqn. (7.53), then eqn. (7.53) requires the
wavenumber $k_z$ to be a matrix in order to find a solution. This is what will
lead to the rotation of the polarization plane for $E_i$. Substituting the above form
for $s^i(x)$ into eqn. (7.53) leads to the replacements $\partial_z \to ik_z$ and $\partial_t \to -i\omega$.
Thus the dispersion relation is

$$ \chi = \left(k_z^2 - \frac{\omega^2}{c^2}\right)\left[(-\omega^2 + \omega_0^2)\delta_{ij} - i\frac{eB\omega}{m}\epsilon_{ij}\right] - \frac{e^2}{m}\rho_N\mu_0\,\omega^2\,\delta_{ij} = 0, \tag{7.55} $$
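or re-arranging,

$$ k_z^2 = \frac{\omega^2}{c^2}\left\{ \delta_{ij} + \frac{e^2\rho_N}{\epsilon_0 m}\;\frac{\left[(-\omega^2 + \omega_0^2)\delta_{ij} + i\frac{eB\omega}{m}\epsilon_{ij}\right]}{(\omega_0^2 - \omega^2)^2 - (eB\omega/m)^2} \right\}. \tag{7.56} $$

This only makes sense if the wavenumber $k_z$ is itself a matrix with a symmetric
and anti-symmetric part:

$$ k_{z\,ij} = k\,\delta_{ij} + \tilde{k}\,\epsilon_{ij}. \tag{7.57} $$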
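It is the anti-symmetric part which leads to a rotation of the plane of polarization.
In fact, $k_z$ has split into a generator of linear translation $k$ plus a generator of
rotations $\tilde{k}$ about the $z$ axis:

$$ k_z = k\times\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \tilde{k}\times\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{7.58} $$

The exponential of the second term is

$$ \begin{pmatrix} \cos(\tilde{k}z) & \sin(\tilde{k}z) \\ -\sin(\tilde{k}z) & \cos(\tilde{k}z) \end{pmatrix}, \tag{7.59} $$

so $\tilde{k}$ is the rate of rotation. Using a binomial approximation for small $B$, we can
write simply

$$ \tilde{k} = \frac{e^3 B\,\omega^2\rho_N}{2m^2 c\,\epsilon_0\left|(\omega_0^2 - \omega^2)^2 - (eB\omega/m)^2\right|}. \tag{7.60} $$

Verdet's constant is defined by the phenomenological relation,

$$ \tilde{k}z = VBz, \tag{7.61} $$

so we have

$$ V = \frac{\rho_N e^3\omega^2}{2m^2 c\,\epsilon_0\left|(\omega_0^2 - \omega^2)^2 - (eB\omega/m)^2\right|}. \tag{7.62} $$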
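The two ingredients of this result can be tried out numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the constants are SI values for the electron, and the sample frequencies and number density are arbitrary choices rather than data for any real material. It checks that the power series for the exponential of the anti-symmetric generator reproduces the rotation matrix, and evaluates the expression for Verdet's constant in the form given by eqn. (7.62).

```python
import math


def rotation_matrix(theta):
    """Closed form of exp(theta * eps) with eps = [[0, 1], [-1, 0]]."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]


def exp_generator(theta, terms=30):
    """Power series for exp(theta * eps), summed term by term."""
    eps = [[0.0, 1.0], [-1.0, 0.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        # term <- term * (theta * eps) / n
        term = [[sum(term[i][k] * eps[k][j] for k in range(2)) * theta / n
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def rotate(E, theta):
    """Apply the rotation to a transverse vector E = [E1, E2]."""
    R = rotation_matrix(theta)
    return [R[0][0] * E[0] + R[0][1] * E[1], R[1][0] * E[0] + R[1][1] * E[1]]


def verdet_constant(omega, omega0, B, rho_N,
                    e=1.602176634e-19, m=9.1093837015e-31,
                    c=299792458.0, eps0=8.8541878128e-12):
    """Verdet's constant in the form of eqn. (7.62), SI units."""
    denom = abs((omega0**2 - omega**2)**2 - (e * B * omega / m)**2)
    return rho_N * e**3 * omega**2 / (2.0 * m**2 * c * eps0 * denom)


if __name__ == "__main__":
    # Illustrative (assumed) inputs: optical frequency, ultraviolet resonance.
    V = verdet_constant(omega=3.0e15, omega0=2.0e16, B=1.0, rho_N=1.0e28)
    angle = V * 1.0 * 0.01  # rotation angle V*B*z for B = 1 T, z = 1 cm
    print("V =", V, "rad/(T m); rotated E =", rotate([1.0, 0.0], angle))
```

Since the square of the generator is minus the identity, the series splits into a cosine part multiplying the identity and a sine part multiplying the generator, which is why the result is a pure rotation.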


7.3.4 Radiation from moving charges in n = 3: retardation


The derivation of the electromagnetic field emanating from a charged particle 
in motion is one of the classic topics of electrodynamics. It is an important 
demonstration of the Green function method for two reasons. First of all, the 
method of Green functions leads quickly to the answer using only straightfor- 
ward algebraic steps. Prior to the Green function method, geometrical analyses 
were carried out with great difficulty. The second reason for looking at this 
example here is that it brings to bear the causal or retarded nature of the physical 
field, i.e. the property that the field can only be generated by charge disturbances 
in the past. This retardation property quickly leads to algebraic pitfalls, since the 
dynamical variables become defined recursively in terms of their own motion in
a strange loop. Unravelling these loops demonstrates important lessons.

We begin by choosing the Lorentz gauge for the photon propagator with $\alpha = 1$.
This choice will give the result for the vector potential in a form which is most
commonly stated in the literature. Our aim is to compute the vector potential
$A_\mu(x)$, and thence the field strength $F_{\mu\nu}$, for a particle at position $x_p(t)$ which
is in motion with speed $v = \partial_t x_p(t)$. The current distribution for a point particle
is singular, and may be written

$$ J^\mu = qc\,\beta^\mu\,\delta^n(x - x_p(t)). \tag{7.63} $$


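The vector potential is therefore, in terms of the retarded propagator,

$$
\begin{aligned}
A_\mu(x) &= \mu_0\int (dx')\; G_r(x,x')\, J_\mu(x') \\
 &= \frac{\mu_0}{4\pi}\int (dx')\; qc\,\beta_\mu(t')\,\frac{\delta\left(c(t' - t_{\rm ret})\right)}{|x - x'|}\;\delta(x' - x_p(t')),
\end{aligned} \tag{7.64}
$$

where the retarded time is defined by $t_{\rm ret} = t - |x - x'|/c$. Performing the integral
over $x^{0\prime}$ in the presence of the delta function sets $t' \to t_{\rm ret}$:

$$ A_\mu(x) = \frac{q}{4\pi\epsilon_0 c}\int d\sigma_{x'}\;\frac{\beta_\mu(t_{\rm ret})\,\delta\left(x' - x_p(t_{\rm ret})\right)}{|x - x'|}. \tag{7.65} $$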


Here $x$ is a free continuous coordinate parameter, which varies over all space
around the charge, and $x_p(t_{\rm ret})$ is the retarded trajectory of the charge $q$. We may
now perform the remaining integral. Here it is convenient to change variables.
Let

$$ \int d\sigma_{x'}\;\delta\left(x' - x_p(t_{\rm ret})\right) = \int d\sigma_r\; J\,\delta(r), \tag{7.66} $$

where $J = \det J_{ij}$ and

$$ J^{-1}_{ij} = \partial'_i r_j = \partial'_i\left(x' - x_p(t_{\rm ret})\right)_j = g_{ij} - \frac{\partial x_{p\,j}}{\partial t_{\rm ret}}\,\frac{\partial t_{\rm ret}}{\partial x'^{\,i}} \tag{7.67} $$

is the Jacobian of the transformation. At this stage, $t_{\rm ret}$ is given by $t_{\rm ret} = t -
|x - x'|/c$, i.e. it does not depend implicitly on itself. After the integration we
are about to perform, it will. We complete the integration by evaluating the
Jacobian:


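$$
\begin{aligned}
\partial'_i\, t_{\rm ret} &= \frac{\hat{r}_i}{c} \\
J^{-1}_{ij} &= g_{ij} - \frac{v_j}{c}\,\hat{r}_i \\
 &= g_{ij}\left.\left(1 - \beta^i\hat{r}_i\right)\right|_{t_{\rm ret}}.
\end{aligned} \tag{7.68}
$$

The last line uses the fact that $r_i$ depends only on $x_i$, not on $x_j$ for $i \neq j$. In this
instance, the determinant becomes $1 - \hat{r}^i\beta_i(t_{\rm ret})$, giving

$$
\begin{aligned}
A_\mu(x) &= \frac{q}{4\pi\epsilon_0 c}\int d\sigma_r\;\frac{\beta_\mu(t_{\rm ret})\,\delta(r)}{\kappa\,|x - x_p(t_{\rm ret}) - r|} \\
 &= \frac{q\,\beta_\mu(t_{\rm ret})}{4\pi\epsilon_0 c\,\kappa\,|x - x_p|},
\end{aligned} \tag{7.69}
$$

where $\kappa = (1 - \beta\cdot\hat{r})$, and all quantities (including $\kappa$ itself) are evaluated at
$t_{\rm ret}$. If we define the light ray $r^\mu$ as the vector from $x_p$ to $x$, then $r^\mu = (r, \mathbf{r})$
and $r = |\mathbf{r}|$, since, for a ray of light, $r = c\Delta t = c \times r/c$. Finally, noting that
$r\kappa = -r^\mu\beta_\mu$, we have the Liénard-Wiechert potential in the Lorentz gauge,

$$ A_\mu(x) = \frac{q}{4\pi\epsilon_0 c}\left.\left(\frac{-\beta_\mu}{r^\nu\beta_\nu}\right)\right|_{t_{\rm ret}}. \tag{7.70} $$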
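The value of the Jacobian determinant used above can be checked directly. The following sketch is an illustration with hypothetical function names and arbitrary sample vectors, taking $g_{ij} = \delta_{ij}$ for the spatial indices (an assumption about the metric convention). It builds the matrix $g_{ij} - \beta_j\hat{r}_i$ and confirms that its determinant equals $\kappa = 1 - \beta\cdot\hat{r}$; for a matrix of this identity-minus-outer-product form the result holds exactly, not merely to first order in $\beta$.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inverse_jacobian(beta, rhat):
    """Matrix delta_ij - beta_j * rhat_i (spatial metric taken as delta_ij)."""
    return [[(1.0 if i == j else 0.0) - beta[j] * rhat[i] for j in range(3)]
            for i in range(3)]


def kappa(beta, rhat):
    """kappa = 1 - beta . rhat."""
    return 1.0 - sum(b * r for b, r in zip(beta, rhat))


if __name__ == "__main__":
    beta = (0.3, -0.2, 0.5)                 # arbitrary velocity in units of c
    rhat = (2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0)  # unit vector
    print(det3(inverse_jacobian(beta, rhat)), kappa(beta, rhat))
```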


To proceed with the evaluation of the field strength $F_{\mu\nu}$, or equivalently the
electric and magnetic fields, it is useful to derive a number of relations which
conceal subtleties associated with the fact that the retarded time now depends
on the position evaluated at the retarded time. In other words, the retarded time
$t_{\rm ret}$ satisfies an implicit equation

$$ t_{\rm ret} = t - \frac{|x - x_p(t_{\rm ret})|}{c} = t - \frac{r}{c}. \tag{7.71} $$


The derivation of these relations is the only complication to this otherwise purely 
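algebraic procedure. Differentiating eqn. (7.71) with respect to $t_{\rm ret}$, we obtain

$$
\begin{aligned}
1 &= \frac{\partial t}{\partial t_{\rm ret}} + \hat{r}^i\beta_i(t_{\rm ret}) \\
\partial_t(t_{\rm ret}) &= \kappa^{-1}.
\end{aligned} \tag{7.72}
$$

Moreover,

$$ (\partial_i t_{\rm ret}) = -\frac{1}{c}(\partial_i r), \tag{7.73} $$

$$ (\partial_i r) = \partial_i\sqrt{r^j r_j} = \hat{r}^j(\partial_i r_j), \tag{7.74} $$

$$ (\partial_i r_j) = g_{ij} - \frac{\partial x_{p\,j}}{\partial t_{\rm ret}}(\partial_i t_{\rm ret}) = g_{ij} + \beta_j(\partial_i r), \tag{7.75} $$

$$ (\partial_i r_j) = g_{ij} + \beta_j\hat{r}^k(\partial_i r_k). \tag{7.76} $$

The last line cannot be simplified further; however, on substituting eqn. (7.76)
into eqn. (7.74), it is straightforward to show that

$$ (\partial_i r)\left(1 - \hat{r}\cdot\beta\right) = \hat{r}_i, \tag{7.77} $$

and thus

$$ (\partial_i r) = \frac{\hat{r}_i}{\kappa}. \tag{7.78} $$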


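This may now be substituted back into eqn. (7.75) to give

$$ (\partial_i r_j) = g_{ij} + \frac{\beta_j\hat{r}_i}{\kappa}. \tag{7.79} $$

Continuing in the same fashion, one derives the following relations:

$$
\begin{aligned}
(\partial_0 r_i) &= -\frac{\beta_i}{\kappa} \\
(\partial_0 r) &= -\frac{\hat{r}^i\beta_i}{\kappa} \\
(\partial_i r) &= \frac{\hat{r}_i}{\kappa} \\
(\partial_i r_j) &= g_{ij} + \frac{\hat{r}_i\beta_j}{\kappa} \\
(\partial_0\beta_i) &= \frac{\alpha_i}{\kappa} \\
(\partial_i\beta_j) &= -\frac{\hat{r}_i\alpha_j}{\kappa} \\
\partial_0(r\kappa) &= \frac{1}{\kappa}\left(\beta^2 - \hat{r}\cdot\beta - \alpha\cdot r\right) \\
\partial_i(r\kappa) &= \frac{\hat{r}_i}{\kappa}\left(1 - \beta^2 + \alpha\cdot r\right) - \beta_i,
\end{aligned} \tag{7.80}
$$

where we have defined $\alpha_\mu = \partial_0\beta_\mu = (0, \dot{\mathbf{v}}/c^2)$. The field strength tensor may
now be evaluated. From eqn. (7.70) one has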


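$$
\begin{aligned}
F_{\mu\nu} &= \partial_\mu A_\nu - \partial_\nu A_\mu \\
 &= \frac{q}{4\pi\epsilon_0 c}\left\{ \frac{\partial_\mu\beta_\nu - \partial_\nu\beta_\mu}{r\kappa} - \frac{(\beta_\nu\partial_\mu - \beta_\mu\partial_\nu)(r\kappa)}{r^2\kappa^2} \right\}.
\end{aligned} \tag{7.81}
$$

And, noting that $\beta_0 = -1$ is a constant, we identify the three-dimensional
electric and magnetic field vectors:

$$
\begin{aligned}
E_i &= cF_{i0} \\
 &= \frac{-q}{4\pi\epsilon_0}\left\{ \frac{\partial_0\beta_i}{r\kappa} - \frac{(\beta_i\partial_0 - \beta_0\partial_i)(r\kappa)}{r^2\kappa^2} \right\} \\
 &= \frac{q}{4\pi\epsilon_0}\left\{ -\frac{\alpha_i}{r\kappa^2} - \frac{(\beta_i - \hat{r}_i)}{r^2\kappa^3}\left(\alpha\cdot r + (1 - \beta^2)\right) \right\},
\end{aligned} \tag{7.82}
$$

$$
\begin{aligned}
B_i &= \frac{1}{2}\epsilon_{ijk}F_{jk} \\
 &= \frac{q}{4\pi\epsilon_0 c}\;\epsilon_{ijk}\left\{ \frac{\partial_j\beta_k}{r\kappa} - \frac{\beta_k\,\partial_j(r\kappa)}{r^2\kappa^2} \right\} \\
 &= -\frac{q}{4\pi\epsilon_0 c}\;\epsilon_{ijk}\,\hat{r}_j\left\{ \frac{\alpha_k}{r\kappa^2} + \frac{\beta_k}{r^2\kappa^3}\left(\alpha\cdot r + 1 - \beta^2\right) \right\}
\end{aligned} \tag{7.83}
$$

$$
\begin{aligned}
 &= \frac{1}{c}\,\epsilon_{ijk}\,\hat{r}_j E_k \\
 &= \frac{1}{c}\left(\hat{r}\times E\right)_i.
\end{aligned} \tag{7.84}
$$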


From these relations, it is clear that the magnetic field is perpendicular to both 
the light ray r and the electric field. The electric field can be written as a sum of 
two parts, usually called the radiation field and the near field: 


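$$ E_{i\,{\rm rad}} = \frac{q}{4\pi\epsilon_0}\left\{ -\frac{\alpha_i}{r\kappa^2} - \frac{(\beta_i - \hat{r}_i)(\alpha\cdot\hat{r})}{r\kappa^3} \right\}, \tag{7.85} $$

$$ E_{i\,{\rm near}} = \frac{q}{4\pi\epsilon_0}\left\{ \frac{(\hat{r}_i - \beta_i)(1 - \beta^2)}{r^2\kappa^3} \right\}. \tag{7.86} $$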


The near field falls off more quickly than the long-range radiation field. The 
radiation field is also perpendicular to the light ray r. Thus, the far-field electric 
and magnetic vectors are completely transverse to the direction of propagation, 
but the near-field electric components are not completely transverse except at 
very high velocities $\beta \sim 1$. Note that all of the vectors in the above expressions
are assumed to be evaluated at the retarded time. 
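These statements about transversality can be verified numerically. The sketch below is illustrative only: the helper names are hypothetical, the sample vectors are arbitrary, and units are chosen (as an assumption) so that $q/4\pi\epsilon_0 = 1$ and $c = 1$. It evaluates the radiation and near fields in the split form described above and checks that the radiation field is orthogonal to the light ray, while the near field is not.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def radiation_field(beta, alpha, rhat, r):
    """Part of E falling off as 1/r, in units with q/(4 pi eps0) = 1."""
    kappa = 1.0 - dot(rhat, beta)
    a_r = dot(alpha, rhat)
    return [-alpha[i] / (r * kappa**2)
            - (beta[i] - rhat[i]) * a_r / (r * kappa**3) for i in range(3)]


def near_field(beta, rhat, r):
    """Part of E falling off as 1/r^2, in the same units."""
    kappa = 1.0 - dot(rhat, beta)
    return [(rhat[i] - beta[i]) * (1.0 - dot(beta, beta)) / (r**2 * kappa**3)
            for i in range(3)]


def magnetic_field(rhat, E):
    """B = rhat x E in units with c = 1."""
    return cross(rhat, E)


if __name__ == "__main__":
    beta = (0.3, 0.1, -0.2)                   # arbitrary velocity / c
    alpha = (0.05, -0.4, 0.2)                 # arbitrary acceleration / c^2
    rhat = (2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0)  # unit vector along the light ray
    E_rad = radiation_field(beta, alpha, rhat, 5.0)
    E_near = near_field(beta, rhat, 5.0)
    print("rhat . E_rad  =", dot(rhat, E_rad))
    print("rhat . E_near =", dot(rhat, E_near))
```

The cancellation in the radiation field works because $(\beta - \hat{r})\cdot\hat{r} = -\kappa$, which removes one power of $\kappa$ from the second term.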

Owing to their special relationship, the magnitude of the magnetic and electric 
components are equal up to dimensional factors: 


$$ |E|^2 = c^2|B|^2. \tag{7.87} $$


Finally, the rate of work or power expended by the field is given by Poynting’s 
vector, 
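$$
\begin{aligned}
S_i &= \epsilon_{ijk}E_j H_k \\
 &= (\mu_0 c)^{-1}\,\epsilon_{ijk}E_j(\hat{r}\times E)_k \\
 &= \epsilon_0 c\;\epsilon_{ijk}E_j(\epsilon_{klm}\hat{r}_l E_m) \\
 &= \epsilon_0 c\,(E\cdot E)\,\hat{r}_i,
\end{aligned} \tag{7.88}
$$

where the last line holds for the transverse radiation field, $E\cdot\hat{r} = 0$.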


7.4 Resonance phenomena and dampening fields 


In the interaction between matter and radiation, bound state transitions lead to 
resonances, or phenomena in which the strength of the response to a radiation 
field is amplified for certain frequencies. Classically these special frequencies
are the normal modes of vibration for spring-like systems with natural frequency
$\omega_0$; quantum mechanically they are transitions between bound state energy
levels with a definite energy spacing $\omega_0 = (E_2 - E_1)/\hbar$. The examples
which follow are all cases of one mathematical phenomenon which manifests 
itself in several different physical scenarios. We see how the unified approach 
reveals these similarities. 


7.4.1 Cherenkov radiation 


The radiation emitted by charged particles which move in a medium where 
the speed of light is less than the speed of the particles themselves is called 
Cherenkov radiation. The effect was observed by Cherenkov [25] and given 
a theoretical explanation by Tamm and Frank [127] within the framework 
of classical electrodynamics. The power spectrum of the radiation may be 
calculated with extraordinary simplicity using covariant field theory [122]. 
Using the covariant formulation in a material medium from section 21.2.4 
and adapting the expression in eqn. (5.118) for the Maxwell field, we have the 
Feynman Green function in the Lorentz—Feynman a = 1 gauge, given by 


$$ D_F(x,x') = \frac{-i}{4\pi^2 c^2|x - x'|}\int d\omega\;\sin\left(\frac{n\omega}{c}|x - x'|\right)e^{-i\omega|t - t'|}, \tag{7.89} $$

where $n$ is the refractive index of the medium. Note that this index is assumed
to be constant here, which is not the case in media of interest. One should
really consider $n = n(\omega)$. However, the expressions generated by this form will
always be correct in $\omega$ space for each value of $\omega$, since the standard textbook
assumption is to ignore transient behaviour ($t$-dependence) of the medium. We
may therefore write the dissipation term as

$$ W = \mu_0\mu_r\int (dx)(dx')\; J^\mu(x)\, D_{F\,\mu\nu}(x,x')\, J^\nu(x'), \tag{7.90} $$


and we are interested in the power spectrum which is defined by 


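$$ \int d\omega\;\frac{P(\omega)}{\hbar\omega} = -\frac{2}{\hbar}\,{\rm Im}\,\frac{dW}{dt}. \tag{7.91} $$

Substituting in expressions for $D_F$, we obtain

$$ {\rm Im}\,W = -\frac{1}{4\pi^2}\int d\omega\,(dx)(dx')\;\frac{\mu_0\mu_r\,\sin\left(\frac{n\omega}{c}|x - x'|\right)}{c^2|x - x'|}\;\cos(\omega|t - t'|)\; J^\mu(x)J_\mu(x'), \tag{7.92} $$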
from which we obtain

$$ P(\omega) = -\frac{\omega\,\mu_0\mu_r}{4\pi^2 n^2}\int d\sigma_x\,(dx')\;\frac{\sin\left(\frac{n\omega}{c}|x - x'|\right)}{|x - x'|}\;\cos(\omega|t - t'|)\left[\rho(x)\rho(x') - \frac{n^2}{c^2}J^i(x)J_i(x')\right]. \tag{7.93} $$
The current distribution for charged particles moving at constant velocity is 


$$ \rho = q\,\delta(x - vt), \qquad J^i = qv^i\,\delta(x - vt); \tag{7.94} $$


thus we have 


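$$
\begin{aligned}
P(\omega,t) &= \frac{q^2}{4\pi^2}\,\mu_0\mu_r\,\omega\beta c\left(1 - \frac{1}{n(\omega)^2\beta^2}\right)\int_{-\infty}^{\infty}\frac{d\tau}{\tau}\;\sin(n\beta\omega\tau)\cos(\omega\tau) \\
 &= \begin{cases} 0, & n\beta < 1 \\[4pt] \dfrac{q^2}{4\pi}\,\mu_0\mu_r\,\omega\beta c\left(1 - \dfrac{1}{n^2\beta^2}\right), & n\beta > 1. \end{cases}
\end{aligned} \tag{7.95}
$$

This is the power spectrum for Cherenkov radiation, showing the threshold
behaviour at $n\beta = 1$. We have derived the Cherenkov resonance condition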
for charges interacting with electromagnetic radiation. The Cherenkov effect is 
more general than this, however. It applies to any interaction in which particles 
interact with waves, either transverse or longitudinal. 
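The threshold behaviour of the spectrum is easy to tabulate. The sketch below is a minimal illustration, assuming SI units and the familiar threshold form $(q^2/4\pi)\,\mu_0\mu_r\,\omega\beta c\,(1 - 1/n^2\beta^2)$ for $n\beta > 1$; the function name and the sample refractive index (roughly that of water) are assumptions introduced here, and the refractive index is treated as a constant.

```python
import math

MU0 = 1.25663706212e-6  # vacuum permeability in SI units (approximate value)


def cherenkov_power_spectrum(omega, beta, n, q=1.602176634e-19,
                             mu_r=1.0, c=299792458.0):
    """Power radiated per unit angular frequency; zero below threshold n*beta <= 1."""
    if n * beta <= 1.0:
        return 0.0
    return (q**2 / (4.0 * math.pi)) * MU0 * mu_r * omega * beta * c \
        * (1.0 - 1.0 / (n * beta)**2)


if __name__ == "__main__":
    n = 1.33        # assumed, frequency-independent refractive index
    omega = 3.0e15  # assumed optical angular frequency in rad/s
    for beta in (0.70, 0.75, 0.80, 0.90, 0.99):
        print(f"beta={beta:.2f}  n*beta={n * beta:.3f}  "
              f"P={cherenkov_power_spectrum(omega, beta, n):.3e}")
```

Below threshold the function returns exactly zero; above it the output grows linearly with $\omega$, which is why a dispersive $n(\omega)$ is needed in practice to cut the spectrum off at high frequency.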


7.4.2 Cyclotron radiation 


Cyclotron, or synchrotron, radiation is emitted by particles accelerated by a 
homogeneous magnetic field. Its analysis proceeds in the same manner as that 
for Cherenkov radiation, but with a particle distribution executing circular rather 
than linear motion. For the current, one requires 


$$ \rho = q\,\delta(x - x_0), \qquad J^i = qv^i\,\delta(x - x_0), \tag{7.96} $$


where $x_0$ is the position of the charged particle. Since the electromagnetic field
is not self-interacting, the Green function for the radiation field is not affected by 
the electromagnetic field in the absence of a material medium. In the presence 
of a polarizable medium, there is an effect, but it is small. (See the discussion 
of Faraday rotation.) 

The force on charges is

$$ \frac{dp_i}{dt} = qF_{ij}v^j, \tag{7.97} $$

and, since this is always perpendicular to the motion, no work is done; thus the
energy is a constant of the motion:

$$ \frac{dE}{dt} = 0. \tag{7.98} $$

The generic equation of circular motion is

$$ \frac{dv_i}{dt} = (\omega\times v)_i, \tag{7.99} $$

which, in this case, may be written as

$$ \frac{dv_i}{dt} = \frac{q}{m}\sqrt{1 - \beta^2}\;(v\times B)_i, \tag{7.100} $$

where $p_i = mv_i/\sqrt{1 - \beta^2}$ and $\beta_i = v_i/c$. Thus, the angular frequency of orbit
is the Larmor frequency,

$$ \omega_i = -\frac{qB_i}{m}\sqrt{1 - \beta^j\beta_j}, \tag{7.101} $$

which reduces to the cyclotron frequency, $\omega_c \simeq eB/m$, in the non-relativistic
limit $\beta_i \to 0$. The radius of revolution is correspondingly

$$ R = \frac{|v|}{\omega} = \frac{mc\beta}{|q|B\sqrt{1 - \beta^2}}. \tag{7.102} $$

The primary difficulty in analysing this problem is a technical one associated
with the circular functions. Taking boundary conditions such that the particle
position is given by

$$
\begin{aligned}
x_1(t) &= R\cos(\omega t) \\
x_2(t) &= R\sin(\omega t) \\
x_3(t) &= 0,
\end{aligned} \tag{7.103}
$$

one finds the velocity

$$
\begin{aligned}
v_1(t) &= -R\omega\sin(\omega t) \\
v_2(t) &= R\omega\cos(\omega t) \\
v_3(t) &= 0.
\end{aligned} \tag{7.104}
$$


This may be substituted into the current in order to evaluate the power spectrum. 
This is now more difficult: one can use an integral representation of the delta 
function, such as the Fourier transform; this inevitably leads to exponentials 
of sines and cosines, or Bessel functions. We shall not pursue these details of 
evaluation here. See ref. [121] for further study of this topic. 
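Although the power spectrum is not evaluated here, the kinematic quantities of the orbit are simple to compute. The sketch below is illustrative: the function names are hypothetical and the field strength and speed are arbitrary sample values in SI units. It evaluates the orbital frequency $|q|B\sqrt{1-\beta^2}/m$ and the radius $mc\beta/(|q|B\sqrt{1-\beta^2})$, and confirms that the circular trajectory has constant speed $R\omega = \beta c$.

```python
import math

C = 299792458.0  # speed of light in m/s


def orbit_frequency(q, B, m, beta):
    """Magnitude of the angular frequency of orbit, |q| B sqrt(1 - beta^2) / m."""
    return abs(q) * B * math.sqrt(1.0 - beta**2) / m


def orbit_radius(q, B, m, beta, c=C):
    """Radius of revolution, m c beta / (|q| B sqrt(1 - beta^2))."""
    return m * c * beta / (abs(q) * B * math.sqrt(1.0 - beta**2))


def velocity(t, R, omega):
    """Velocity components of the circular trajectory x = (R cos wt, R sin wt, 0)."""
    return (-R * omega * math.sin(omega * t), R * omega * math.cos(omega * t), 0.0)


if __name__ == "__main__":
    q, m = -1.602176634e-19, 9.1093837015e-31  # electron charge and mass
    B, beta = 1.0, 0.5                          # assumed sample values
    w = orbit_frequency(q, B, m, beta)
    R = orbit_radius(q, B, m, beta)
    v = velocity(1.0e-12, R, w)
    print("omega =", w, "rad/s, R =", R, "m, |v|/c =", math.hypot(v[0], v[1]) / C)
```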
7.4.3 Landau damping


Landau damping is the name given to the dissipative mixing of momenta in 
any particle field or plasma which interacts with a wave. The phenomenon of 
Landau damping is quite general and crops up in many guises, but it is normally 
referred to in the context of the interaction of a plasma with electromagnetic 
waves. In a collisionless plasma (no scattering by self-interaction), there is still 
scattering by the interaction of plasma with the ambient electromagnetic field, 
similar to the phenomenon of stimulated absorption/emission. However, any 
linear perturbation or source can cause the energy in one plasma mode to be re- 
channelled into other modes, thus mixing the plasma and leading to dissipation. 
All one needs is a linear interaction between the waves and the plasma field, and 
a resonant amplifier, which tends to exaggerate a specific frequency. 

In simple terms, a wave acts like a sinusoidal potential which scatters and 
drags the particle field. If the phase of the field is such that it strikes the upward 
slope of the wave, it is damped or reflected, losing energy. If the phase is such 
that the field ‘rolls down’ the downward slope of the wave, it is enhanced and 
gains energy. In a random system, the average effect is to dissipate or to dampen
the field so that all particles or field modes tend to become uniform. In short, 
Landau damping is the re-organization of energy with the modes of a field due 
to scattering off wavelets of another field. 

Let us consider an unbound particle displacement field with action 


$$ S = \frac{1}{\sigma_x}\int (dx)\,\left\{ -\frac{1}{2}m\dot{s}^2 - J^i s_i \right\}, \tag{7.105} $$

coupled through the current $J_i$ to the electromagnetic field. The position of a
particle is

$$ x^i = \bar{x}^i + \delta x^i = \bar{x}^i + s^i, \tag{7.106} $$

and its velocity is

$$ \dot{x}^i = \bar{v}^i + \delta v^i. \tag{7.107} $$

The velocity of a free particle is constant until the instant of its infinitesimal
perturbation by a wave, so we write

$$ \bar{x}^i = \bar{v}^i t, \tag{7.108} $$

so that

$$ k_\mu x^\mu = k_i x^i - \omega t = k_i s^i + (k_i\bar{v}^i - \omega)t. \tag{7.109} $$

The perturbation is found from the solution to the equation of motion:

$$ s^i = \int (dx')\; G^{ij}(x,x')\, E_j(x'), \tag{7.110} $$
or

$$
\begin{aligned}
\delta v^i &= \frac{q}{m}\,{\rm Re}\;\frac{E_0^i\,\exp\left[i\left(k_j s^j + (k_j\bar{v}^j - \omega)t\right) + \gamma t\right]}{i(k_j\bar{v}^j - \omega) + \gamma} \\
s^i &= \frac{q}{m}\,{\rm Re}\;\frac{E_0^i\,\exp\left[i\left(k_j s^j + (k_j\bar{v}^j - \omega)t\right) + \gamma t\right]}{\left[i(k_j\bar{v}^j - \omega) + \gamma\right]^2}.
\end{aligned} \tag{7.111}
$$

An infinitesimal regulating parameter, $\gamma$, is introduced here in order to define a
limit in what follows. This has causal implications for the system. It means that
the field either grows from nothing in the infinite past or dissipates to nothing
in the infinite future. This is reflected by the fact that its sign determines the
sign of the work done. Eventually, we shall set $\gamma$ to zero. The work done by
this interaction between the charged particle $q$ and the electric field $E^i$ is $qE_i x^i$.
The rate of work is

$$ \frac{d}{dt}\left[qE_i x^i\right] = q(\partial_t E_i)\,x^i + qE_i v^i. \tag{7.112} $$


The two terms signify action and reaction, so that the total rate of work is zero, 
expressed by the total derivative. The second term is the rate of work done 
by the charge on the field. It is this which is non-zero and which leads to the 
dampening effect and apparent dissipation. Following Lifshitz and Pitaevskii 
[90], we calculate the rate of work per particle as follows, 


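$$
\begin{aligned}
\frac{dw}{dt} &= {\rm Re}\; q\,v^i E_i \\
 &= {\rm Re}\; q(\bar{v}^i + \delta v^i)\,E_i(t, \bar{x} + s) \\
 &= {\rm Re}\; q(\bar{v}^i + \delta v^i)\left(E_i(t,\bar{x}) + \partial_j E_i(t,\bar{x})\,s^j + \ldots\right).
\end{aligned} \tag{7.113}
$$

To first order, the average rate of work is thus

$$
\begin{aligned}
\left\langle\frac{dw}{dt}\right\rangle &= {\rm Re}\; q\bar{v}^i\left\langle(\partial_j E_i)\,s^j\right\rangle + {\rm Re}\; q\left\langle\delta v^i E_i\right\rangle \\
 &= \frac{1}{2}q\bar{v}^i\,{\rm Re}\left(\partial_j E_i^*\,s^j\right) + \frac{1}{2}q\,{\rm Re}\left(\delta v^i E_i^*\right).
\end{aligned} \tag{7.114}
$$

Here we have used the fact that

$$ {\rm Re}\,A = \frac{1}{2}(A + A^*) \tag{7.115} $$

and

$$ \left\langle{\rm Re}\,A\cdot{\rm Re}\,B\right\rangle = \frac{1}{4}(AB^* + A^*B) = \frac{1}{2}{\rm Re}\,(AB^*), \tag{7.116} $$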
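since terms involving $A^2$ and $B^2$ contain $e^{2i\omega t}$ and average to zero over time (after
setting $\gamma \to 0$). Substituting for $s^i$ and $\delta v^i$, we obtain

$$ \left\langle\frac{dw}{dt}\right\rangle = \frac{q^2}{2m}\,{\rm Re}\; E_i^* E_j\left[\frac{-i\,\bar{v}^i k^j}{\left[i(k_l\bar{v}^l - \omega) + \gamma\right]^2} + \frac{\delta^{ij}}{\left[i(k_l\bar{v}^l - \omega) + \gamma\right]}\right]. \tag{7.117} $$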


The tensor structure makes it clear that the alignment, $k_i$, and polarization, $E_i$, of
the electric wave to the direction of particle motion, $\bar{v}_i$, is important in deciding
the value of this expression. Physically, one imagines a wave (but not a simple
transverse wave) moving in direction $k_i$ and particles surfing over the wave in a
direction given by $\bar{v}_i$. The extent to which the wave offers them resistance, or
powers them along, decides what work is done on them. For transverse wave
components, $k^i E_i = 0$, the first term vanishes. From the form of eqn. (7.117)
we observe that it is possible to write

$$ \left\langle\frac{dw}{dt}\right\rangle = \frac{q^2}{2m}\,E_i^* E_j\,\frac{d}{d(k_i\bar{v}_j)}\left[\frac{\gamma\,(k_n\bar{v}^n)}{(k_l\bar{v}^l - \omega)^2 + \gamma^2}\right], \tag{7.118} $$

and, using

$$ \lim_{\gamma\to 0}\frac{\gamma}{z^2 + \gamma^2} = \pi\,\delta(z), \tag{7.119} $$

we have

$$ \left\langle\frac{dw}{dt}\right\rangle = \pm\frac{\pi q^2}{2m}\,E_i^* E_j\,\frac{d}{d(k_i\bar{v}_j)}\left[(k_l\bar{v}^l)\,\delta(k_m\bar{v}^m - \omega)\right]. \tag{7.120} $$


To avoid unnecessary complication, let us consider the contribution to this 
which is most important in the dampening process, namely a one-dimensional 
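alignment of $k_i$ and $\bar{v}_i$:

$$ \left\langle\frac{dw}{dt}\right\rangle = \pm\frac{\pi q^2}{2m}\,|E_\parallel|^2\,\frac{d}{d(kv)}\left[(kv)\,\delta(kv - \omega)\right]. \tag{7.121} $$

This expression is for one particle. For the whole particle field we must perform
the weighted sum over the whole distribution, $f(\omega)$, giving the total rate of
work:

$$
\begin{aligned}
\left\langle\frac{dW}{dt}\right\rangle &= \pm\frac{\pi q^2}{2m}\,|E_\parallel|^2\int d(kv)\; f(kv)\,\frac{d}{d(kv)}\left[(kv)\,\delta(kv - \omega)\right] \\
 &= \mp\frac{\pi q^2}{2m}\,|E_\parallel|^2\;\omega\left.\frac{df(\omega)}{d\omega}\right|_{v=\omega/k}.
\end{aligned} \tag{7.122}
$$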
The integral over the delta function picks out contributions when the velocity
of particles, $\bar{v}_i$, matches the phase velocity of the electromagnetic wave, $\omega/k_i$.
This result can now be understood either in real space from eqn. (7.114) or
in momentum space from eqn. (7.122). The appearance of the gradient of the
electric field in eqn. (7.114) makes it clear that the dissipation is caused as a
result of motion in the potential of the electric field. Eqn. (7.122) contains
$df/d\omega$, for frequencies where the phase velocity is in resonance with the
velocity of the particles within the particle field; this tells us that particles with
$v < \omega/k$ gain energy from the wave, whereas $v > \omega/k$ lose energy to it ($\gamma > 0$).
The electric field will be dampened if the shape of the distribution is such that
there are more particles with $v < \omega/k$ than with $v > \omega/k$. This is typical for
long-tailed distributions like thermal distributions.
This can be compared with the discussion in section 6.1.4. 
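The dependence of the sign of the work on the slope of the distribution can be illustrated with a short calculation. The sketch below is a hedged example, not the derivation in the text: it assumes the one-dimensional form in which the distribution is written as a function of velocity, so that the total rate of work on the particles is $-\pi q^2\omega|E|^2 f'(\omega/k)/(2mk|k|)$ for $\gamma > 0$, and it uses a Maxwellian distribution with hypothetical function names and arbitrary units.

```python
import math


def maxwellian(v, v_th):
    """Normalized one-dimensional thermal distribution with thermal speed v_th."""
    return math.exp(-v**2 / (2.0 * v_th**2)) / (math.sqrt(2.0 * math.pi) * v_th)


def maxwellian_slope(v, v_th):
    """Derivative df/dv of the Maxwellian."""
    return -v / v_th**2 * maxwellian(v, v_th)


def landau_work_rate(omega, k, E_par, q, m, slope_at):
    """Rate of work done on the particle field, assuming gamma > 0.

    slope_at(v) must return df/dv; it is evaluated at the phase velocity omega/k.
    A positive result means that the particles gain energy and the wave is damped.
    """
    v_phase = omega / k
    return (-math.pi * q**2 * omega / (2.0 * m * k * abs(k))
            * abs(E_par)**2 * slope_at(v_phase))


if __name__ == "__main__":
    # Arbitrary units; phase velocity twice the thermal speed (assumed values).
    rate = landau_work_rate(omega=2.0, k=1.0, E_par=0.1, q=1.0, m=1.0,
                            slope_at=lambda v: maxwellian_slope(v, 1.0))
    print("rate of work on particles:", rate)
```

For a thermal distribution the slope at a positive phase velocity is negative, so the particles gain energy on balance; a distribution with a bump (positive slope) at the phase velocity would reverse the sign.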


7.4.4 Laser cooling 


Another example of resonant scattering with many experimental applications is 
the phenomenon of laser cooling. This can be thought of as Landau damping for 
neutral atoms, using the dipole force as the braking agent. We shall consider
only how the phenomenon comes about in terms of classical fields, and sketch 
the differences in the quantum mechanical formulation. By now, this connection 
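should be fairly familiar. The shift in energy of an electromagnetic field by virtue
of its interaction with a field of dipoles moving at fractional speed $\beta^i$ is the work
done in the rest frame of the atom,

$$
\begin{aligned}
\Delta W &= -\frac{1}{2}\int d\sigma_x\; P(x)\cdot E(x) \\
 &= -\frac{q^2}{2m}\int (dx')\,d\sigma_x\; E^i(x)\, G_{ij}(x,x')\, E^j(x'),
\end{aligned} \tag{7.123}
$$

where

$$ \left((1 - \beta^i\hat{k}_i)^2\partial_t^2 - \gamma\partial_t + \omega_0^2\right)G_{ij}(x,x') = \delta_{ij}\, c\,\delta(x,x') \tag{7.124} $$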

(see eqn. (2.88)), and therefore the dipole force $F$ on each atom may be deduced
from $dW = F\cdot dr$. The imaginary part of the energy is the power exchanged
by the electromagnetic field, which is related to the damping rate or here the
cooling rate of the atoms. The force on an atom is the gradient of the real part
of the work:

$$ F_i = -\frac{q^2}{2m}\,{\rm Re}\int d\sigma_x\;\partial_i\left[E^j(x)\int (dx')\; G_{jk}(x,x')\, E^k(x')\right]. \tag{7.125} $$

If we consider a source of monochromatic radiation interacting with the particle
field (refractive index $n$),

$$ E^i(x) = E_0^i\, e^{ikx} = E_0^i\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}, \tag{7.126} $$
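where the frequency $\omega$ is unspecified but satisfies $k^2c^2 = n^2\omega^2$, then we have

$$
\begin{aligned}
F_i &= -\frac{q^2}{2m}\,E_0^2\;{\rm Re}\;\partial_i\int d\sigma_x\,(dx')\,(dk')\;\frac{e^{ix(k+k') + ix'(k'-k)}}{-\omega_\beta^2 + \omega_0^2 + i\gamma\omega} \\
 &= -\frac{q^2}{2m}\,E_0^2\;{\rm Re}\;\partial_i\int d\sigma_x\;\frac{e^{2ikx}}{-\omega_\beta^2 + \omega_0^2 + i\gamma\omega}.
\end{aligned} \tag{7.127}
$$

Table 7.1. Doppler effect on momentum.

    Resonance-enhanced            Parallel                  Anti-parallel
    (diagonal)                    $\hat{k}_i\beta^i > 0$    $\hat{k}_i\beta^i < 0$
    $\omega_\beta > \omega_0$     $F_i\beta^i < 0$          $F_i\beta^i > 0$
    $\omega_\beta < \omega_0$     $F_i\beta^i > 0$          $F_i\beta^i < 0$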

This expression contains forward and backward moving photons of fixed
frequency, $\omega$, and wavenumber, $k_i$. The sign of the force acting on the atoms
depends on the frequency relative to the resonant frequency, $\omega_0$, and we are
specifically interested in whether the force acts to accelerate the atoms or
decelerate them relative to their initial velocity. The fact that atoms in the
particle field move in all directions on average means that some will
experience Doppler blue-shifted radiation frequencies and others will experience
red-shifted frequencies, relative to the direction of photon wavevector, $k_i$. In
effect, the Doppler effect shifts the resonant peak above and below its stationary 
value making two resonant ‘side bands’. These side bands can lead to energy 
absorption. This is best summarized in a table (see table 7.1). 

As the velocity component, $v^i = \beta^i c$, of a particle field increases, the
value of $1 - \beta^i\hat{k}_i$ either increases (when $\hat{k}$ and $\beta^i$
point in opposing directions) or decreases (when $\hat{k}$ and $\beta^i$ point in
the same direction). The component of velocity in the direction of the photons,
$E^i$, is given by $\hat{k}^i\beta_i$, and its sign has two effects. It can bring
$\omega_\beta$ closer to or further from the resonant frequency, $\omega_0$,
thus amplifying or attenuating the force on the particles. The force is greater
for those values which are closest to resonance. It also decides whether the sign
of the force is such that it tends to increase or decrease the magnitude of
$\beta^i$. It may be seen from table 7.1 that the force is always such as to
make the velocity tend to a value which makes $\omega_\beta = \omega_0$. Thus by
sweeping the value of $\omega$ from a value just above resonance to resonance, it
should be possible to achieve $\beta^i \to 0$. The lowest attainable temperature
according to this simple model is limited by the value of $\omega_0$.
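
The Doppler argument above can be illustrated numerically. The following is only
a minimal sketch, assuming that the shifted frequency has the form
$\omega_\beta = \omega(1 - \beta^i\hat{k}_i)$ suggested by the text, and assuming
a Lorentzian-type resonance weight
$1/[(\omega_0^2 - \omega_\beta^2)^2 + \gamma^2\omega_\beta^2]$. The function
names and the numerical values are hypothetical and chosen purely for
illustration.

```python
def doppler_shifted_frequency(omega, beta, k_hat):
    """Assumed form omega_beta = omega * (1 - beta . k_hat)."""
    return omega * (1.0 - sum(b * k for b, k in zip(beta, k_hat)))


def resonance_weight(omega_beta, omega_0, gamma):
    """Assumed Lorentzian-type weight; large when omega_beta is near omega_0."""
    return 1.0 / ((omega_0**2 - omega_beta**2) ** 2 + (gamma * omega_beta) ** 2)


omega_0 = 1.0            # resonant frequency (arbitrary units)
gamma = 1e-4             # damping constant (illustrative value)
omega = 0.9999           # radiation tuned slightly below resonance
k_hat = (1.0, 0.0, 0.0)  # unit vector along the photon wavevector

against_beam = (-1e-4, 0.0, 0.0)  # atom moving anti-parallel to k_hat
with_beam = (+1e-4, 0.0, 0.0)     # atom moving parallel to k_hat

w_against = doppler_shifted_frequency(omega, against_beam, k_hat)
w_with = doppler_shifted_frequency(omega, with_beam, k_hat)

# The anti-parallel atom sees a blue-shifted frequency, closer to resonance.
weight_against = resonance_weight(w_against, omega_0, gamma)
weight_with = resonance_weight(w_with, omega_0, gamma)
```

With these illustrative numbers the atom moving against the beam is shifted
towards $\omega_0$ and so carries the larger resonance weight, while the atom
moving with the beam is shifted away from it.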

In order to reduce all of the components of the velocity to minimal values, it is
desirable to bathe a system in crossed laser beams in three orthogonal directions.
Such laser beams are called optical molasses, a kind of quagmire for resonant
particle fields. Clearly, systems with a low-frequency resonance are desirable in
order to push the magnitude of $\beta^i$ down to a minimum. The rate of energy
loss is simply the damping constant, $\gamma$.


7.5 Hydrodynamics 


The study of the way in which bulk matter fields spread through a system is 
called hydrodynamics. Because it deals with bulk matter, hydrodynamics is a 
macroscopic, statistical discussion. It involves such ideas as flow and diffusion, 
and is described by a number of essentially classical phenomenological equa- 
tions. 


7.5.1 Navier-Stokes equations 


The Navier-Stokes equations are the central equations of fluid dynamics. They 
are an interesting example of a vector field theory because they can be derived 
from an action principle in two different ways. Fluid dynamics describes a 
stationary system with a fluid flowing through it. The velocity is a function 
of position and time, since the flow might be irregular; moreover, because the 
fluid flows relative to a fixed pipe or container, the action is not invariant under 
boosts. 


Formulation as a particle field Using a ‘microscopic’ formulation, we can treat 
a fluid as a particle displacement field without a restoring force (spring tension 
zero). We begin by considering such a field at rest: 


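$$S = \int (dx)\left\{ \frac{1}{2}\rho\,\dot{s}^2 - \frac{1}{2}\eta\,(\partial^i s^j)\,\partial_t(\partial_i s_j) + s^i\,(F_i - \partial_i P)\right\}. \tag{7.128}$$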


Notice the term linear in the derivative which is dissipative and represents the
effect of a viscous frictional force (see section 7.2). $\eta$ is the coefficient of
viscosity. In this form, the equations have made an assumption which relates
bulk and shear viscosity, leaving only a single effective viscosity. This is the
form often used experimentally. Varying the action with respect to $s^i$ leads to
the field equation

$$-\rho\,\ddot{s}_i + \eta\,\nabla^2\dot{s}_i + F_i - \partial_i P = 0. \tag{7.129}$$

Or, setting $v_i = \dot{s}_i$,

$$\rho\,\partial_t v_i - \eta\,\nabla^2 v_i + \partial_i P = F_i. \tag{7.130}$$


This is the equation of a velocity field at rest. In order to boost it into a moving
frame, we could re-define positions by $x_i \to x_i - v_i t$, but it is more
convenient to re-define the time coordinate to so-called retarded time (see
section 9.5.2). With this transformation, we simply replace the time derivative
for $v_i$ by

$$\frac{d}{dt_{\rm ret}} = \partial_t + v^j\partial_j. \tag{7.131}$$

This gives

$$\rho\,\frac{dv_i}{dt_{\rm ret}} - \eta\,\nabla^2 v_i + \partial_i P = F_i. \tag{7.132}$$

In fluid dynamics, this derivative is sometimes called the substantive derivative; 
it is just the total derivative relative to a moving frame. This transformation of 
perspective introduces a non-linearity into the equation which was not originally 
present. It arises physically from a non-locality in the system; i.e. the fact that
the velocity-dependent forces at a remote point lead to a delayed effect on the
velocity at a local point. Put another way, the velocity at one point interacts
with the velocity at another point because of the flow, just as in a particle 
scattering problem. In particle theory parlance, we say that the velocity field 
scatters off itself, or is self-interacting. It would have been incorrect to apply 
this transformation to the action before variation since the action is a scalar and 
was not invariant under this transformation, thus it would amount to a change 
of the physics. Since the action is a generator of constraints, it would have 
additional consequences for the system, as we shall see below. 
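
To see the substantive derivative at work, the sketch below evaluates
$(\partial_t + v\,\partial_x)v$ by central differences in one spatial dimension.
The test field $v(x,t) = x/(1+t)$ is an assumption made here for illustration
only (it is not taken from the text). It is chosen because its partial time
derivative is non-zero while its total derivative along the flow vanishes, so
the non-linear term is seen to matter.

```python
def velocity(x, t):
    """Illustrative test field v(x, t) = x / (1 + t)."""
    return x / (1.0 + t)


def partial_time_derivative(v, x, t, h=1e-5):
    """Central-difference estimate of d_t v at fixed position."""
    return (v(x, t + h) - v(x, t - h)) / (2.0 * h)


def substantive_derivative(v, x, t, h=1e-5):
    """Central-difference estimate of (d_t + v d_x) v at the point (x, t)."""
    dv_dx = (v(x + h, t) - v(x - h, t)) / (2.0 * h)
    return partial_time_derivative(v, x, t, h) + v(x, t) * dv_dx


x, t = 2.0, 0.5
partial_only = partial_time_derivative(velocity, x, t)  # about -x/(1+t)^2
total = substantive_derivative(velocity, x, t)          # about zero
```

For this field an observer at a fixed point sees the velocity decreasing, while
an observer moving with the fluid sees no acceleration at all.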


Formulation as an effective velocity field The description above is based upon 
a microscopic picture of a fluid as a collection of particles. We need not think 
like this, however. If we had never built a large enough microscope to be able to 
see atoms, then we might still believe that a fluid were a continuous substance. 
Let us then formulate the problem directly in terms of a velocity field. We may 
write the action 


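$$S = \tau\int (dx)\left[ -\frac{1}{2}\rho\,v^i\overleftrightarrow{\partial_t}\,v_i + \frac{1}{2}\eta\,(\partial^i v^j)(\partial_i v_j) - v^i\,(F_i - \partial_i P)\right]. \tag{7.133}$$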


The constant scale $\tau$ has the dimensions of time and is necessary on purely
dimensional grounds. The fact that we need such an arbitrary scale is an
indication that this is just an average, smeared out field theory rather than a
microscopic description. It has no physical effect on the equations of motion
unless we later attempt to couple this action to another system where the same
scale is absent or different. Such is the nature of dimensional analysis. The
linear derivatives in the action are symmetrized for the reasons discussed in
section 4.4.2. Varying this action with respect to the velocity $v^i$, and treating
$\rho$ as a constant for the time being, leads to

$$\rho\,\partial_t v_i - \eta\,\nabla^2 v_i + \partial_i P = F_i. \tag{7.134}$$

Changing to retarded time, as before, we have the Navier-Stokes equation,

$$\rho\,\partial_t v_i + \rho\,v^j(\partial_j v_i) - \eta\,\nabla^2 v_i + \partial_i P = F_i. \tag{7.135}$$


Again, it would be incorrect to transform the action before deriving the field 
equations, since the action is a scalar and it is not invariant under this transfor- 
mation. 

Consider what would have happened if we had tried to account for the 
retardation terms in the action from the beginning. Consider the action 


1 _o 1 ; > 1 ni 
S=t fol- v’ ð vi + 5P v' (0,0) 07 — 5P 0;(v' v!)v; 
1 A . 
+ 51(8'v!) (8:04) = v (F; = apy}. (7.136) 


The action is now non-linear from the beginning since it contains the same
retardation information as the transformed eqn. (7.132). The derivatives are
symmetrized also in spatial directions. The variation of the action is also more
complicated. We shall now let $\rho$ depend on $x$. After some calculation,
variation with respect to $v^i$ leads to an equation which can be separated into
parts:
f 1 2 
(3p)vi + puj(djv’) + z io) =0 
p (rvi) + pv/djuj — n V? vi + 3P = F. (7.137) 


The first of these occurs because the density is no longer constant; it is
tantalizingly close to the conservation equation for current

$$-\partial_t\rho = \partial_i(\rho\,v^i), \tag{7.138}$$

but alas is not quite correct. The equations of motion (7.137) are almost the
same as before, but now the derivative terms are not quite correct. Instead of

$$v^j\partial_j v_i \tag{7.139}$$

we have the symmetrical

$$v^j\partial_i v_j. \tag{7.140}$$

This result is significant. The terms are not unrelated. In fact, since we can
always add and subtract a term, it is possible to relate them by

$$v^j\partial_j v_i = v^j(\partial_i v_j) + v^j(\partial_j v_i - \partial_i v_j). \tag{7.141}$$

The latter term is the curl of the velocity. What this means is that the two terms
are equivalent provided that the curl of the velocity vanishes. It vanishes in
the absence of eddies or other phenomena which select a preferred direction
in space or time. This is indicative of the symmetry of the action. Since the
action was invariant under space and time reversal, it can only lead to equations
of motion with the same properties. Physically, this restriction corresponds to
purely irrotational flow. Notice how the symmetry which is implicit in the action
leads directly to a symmetry in the field equations. The situation was different
in our first formulation, where we chose to transform the action to retarded time
(an intrinsically asymmetrical operation).
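
The statement that $v^j\partial_j v_i$ and $v^j\partial_i v_j$ agree only for
curl-free flow can be checked numerically. The sketch below is an illustration
under stated assumptions: the two planar test flows (a potential flow
$v = \nabla(x^2 y)$ and a rigid rotation $v = (-y, x)$) and all function names
are hypothetical choices made here, and derivatives are taken by central
differences.

```python
def grad(f, x, y, h=1e-5):
    """Central-difference gradient (d_x f, d_y f) of a scalar function."""
    return ((f(x + h, y) - f(x - h, y)) / (2.0 * h),
            (f(x, y + h) - f(x, y - h)) / (2.0 * h))


def advective_term(v, x, y):
    """Components of v^j d_j v_i."""
    vx, vy = v(x, y)
    d_vx = grad(lambda a, b: v(a, b)[0], x, y)
    d_vy = grad(lambda a, b: v(a, b)[1], x, y)
    return (vx * d_vx[0] + vy * d_vx[1], vx * d_vy[0] + vy * d_vy[1])


def symmetric_term(v, x, y):
    """Components of v^j d_i v_j, which equals d_i (v^2 / 2)."""
    return grad(lambda a, b: 0.5 * (v(a, b)[0] ** 2 + v(a, b)[1] ** 2), x, y)


def potential_flow(x, y):
    """Curl-free test flow, v = grad(x^2 y)."""
    return (2.0 * x * y, x * x)


def rigid_rotation(x, y):
    """Test flow with non-zero curl, v = (-y, x)."""
    return (-y, x)


point = (1.0, 2.0)
adv_pot = advective_term(potential_flow, *point)
sym_pot = symmetric_term(potential_flow, *point)
adv_rot = advective_term(rigid_rotation, *point)
sym_rot = symmetric_term(rigid_rotation, *point)
```

For the potential flow the two expressions coincide at the chosen point, whereas
for the rigid rotation they differ in sign, the difference being the curl term
in eqn. (7.141).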

The problem of an $x$-dependent density $\rho$ is not resolvable here. The
fundamental problem is that the flow equation is not reversible, whereas the
action would like to be. If we omit the non-linear terms, the problem of
finding an action which places no restriction on $\rho$ is straightforward, though
not particularly well motivated. We shall not pursue this here. The lesson
to be learned from this exercise is that, because the action is a scalar, the
action principle will always tend to generate field equations consistent with the
symmetries of the fields it is constructed from. Here we have tried to generate
a term $v^j\partial_j v_i$ from an action principle, but the infinitesimal variation
of this term led to new constraints since the action is
spacetime-reflection-invariant. The problem of accommodating an $x$-dependent
density is confounded by these other problems. In short, non-covariant analyses
do not lend themselves to a covariant formulation, but should be obtained as a
special case of a better-defined problem, as in the first method.


7.5.2 Diffusion 


Let us consider the rate at which conserved matter diffuses throughout a system
when unencumbered by collisions. Consider a matter current, $J_\mu$, whose
average, under the fluctuations of the system, is conserved:

$$\partial^\mu\langle J_\mu\rangle = 0. \tag{7.142}$$


We need not specify the nature of the averaging procedure, nor the origin of the 
fluctuations here. Phenomenologically one has a so-called constitutive relation 
[53], which expresses a phenomenological rate of flow in terms of local density 
gradients: 


$$\langle J_i\rangle = -D\,\partial_i\langle\rho\rangle. \tag{7.143}$$

Substituting this into the conservation equation gives

$$(\partial_t - D\nabla^2)\langle\rho\rangle = 0. \tag{7.144}$$

This is a diffusion equation, with diffusion coefficient $D$. If we multiply this
equation by the positions squared, $x^2$, and integrate over the entire system,

$$\int d\sigma\; x^2\,(\partial_t - D\nabla^2)\langle\rho\rangle = 0, \tag{7.145}$$

we can interpret the diffusion constant in terms of the mean square displacement
of the field. Integrating by parts, and assuming that there is no diffusion at the
limits of the system, one obtains

$$\partial_t\langle x^2\rangle - 2D \sim 0, \tag{7.146}$$

or

$$\langle x^2\rangle \sim 2Dt, \tag{7.147}$$

which indicates that the mean square displacement of the particles grows at a
rate of $2D$ per unit time, so that $D$ has the dimensions of length squared per
unit time. Notice that, since $D$ characterizes the diffusion of averaged
quantities, it need not be a constant. We shall think of it as a slowly varying
function of space and time. The variation, however, must be so slow that it is
effectively constant over the dominant scales of the system. We shall derive a
Kubo-type relation for this quantity [53].
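
The relation $\langle x^2\rangle \sim 2Dt$ can be checked with a small numerical
experiment. The following is a sketch only, assuming a one-dimensional system, a
constant $D$, and a simple explicit finite-difference update of eqn. (7.144); the
grid parameters and function names are illustrative choices, not part of the
text. The grid is made wide enough that nothing reaches the edges, in keeping
with the assumption of no diffusion at the limits of the system.

```python
def diffuse(rho, D, dx, dt, steps):
    """Explicit update of d_t rho = D d_x^2 rho, with rho fixed at the edges."""
    r = D * dt / dx**2  # must not exceed 1/2 for this scheme to be stable
    for _ in range(steps):
        new = rho[:]
        for i in range(1, len(rho) - 1):
            new[i] = rho[i] + r * (rho[i + 1] - 2.0 * rho[i] + rho[i - 1])
        rho = new
    return rho


def mean_square_displacement(rho, xs):
    """<x^2> weighted by the density, normalized by the total amount of matter."""
    return sum(p * x * x for p, x in zip(rho, xs)) / sum(rho)


D, dx, dt, steps = 0.5, 0.1, 0.005, 200
n = 801                                   # wide enough that the edges stay empty
xs = [(i - n // 2) * dx for i in range(n)]
rho0 = [0.0] * n
rho0[n // 2] = 1.0                        # all matter starts at x = 0

rho = diffuse(rho0, D, dx, dt, steps)
elapsed = steps * dt
msd = mean_square_displacement(rho, xs)   # should be close to 2 * D * elapsed
```

Doubling the number of steps doubles the mean square displacement, which is the
linear growth in time expressed by eqn. (7.147).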

From eqn. (7.144), we may solve 


(ox) = = etx- 5 (k) §(—iw — Dk’), (7.148) 
or 
GO (k) = =e (7.149) 
Thus 
o= f Fae o), (7.150) 


To determine the effect of fluctuations in this system, consider adding an 
infinitesimal source, 


$$(\partial_t - D\nabla^2)\langle\rho\rangle = F. \tag{7.151}$$


The purely mechanical retarded response to F gives us the following relation: 


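$$\langle\rho\rangle(x) = \int (dx')\; G_r(x,x')\,F(x'), \tag{7.152}$$

where the retarded Green function may be evaluated by analogy with eqn. (5.77)

$$\begin{aligned}
G_r(x,x') &= \int \frac{d^nk\,d\omega}{(2\pi)^{n+1}}\; e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)}\left[\frac{1}{\omega + iDk^2 - i\epsilon} - \frac{1}{\omega - iDk^2 + i\epsilon}\right] \\
&= \int \frac{d^nk\,d\omega}{(2\pi)^{n+1}}\; e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)}\;\frac{-2iDk^2}{(\omega - i\epsilon)^2 + D^2k^4}.
\end{aligned} \tag{7.153}$$

From eqn. (6.67) we have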


(p) = > [axe FO, (7.154) 


where 


2 


i f -W : 3 
z oO) y= FFI = —ilmGp(x, x’). (7.155) 


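The Feynman Green function may be evaluated using the phase, or weight
$\exp(iS/\hbar)$, by analogy with eqn. (5.95):

$$\begin{aligned}
G_F(x,x') &= \int \frac{d^nk\,d\omega}{(2\pi)^{n+1}}\; e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)}\left[\frac{1}{\omega + iDk^2 - i\epsilon} - \frac{1}{\omega - iDk^2 + i\epsilon}\right] \\
&= \int \frac{d^nk\,d\omega}{(2\pi)^{n+1}}\; e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)}\;\frac{-2iDk^2}{\omega^2 + D^2k^4 - i\epsilon}.
\end{aligned} \tag{7.156}$$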


For thermal or other distributions it will be somewhat different. We may now 
compare this (in momentum space) with the linear response equation: 

(pe) (k) = ImGr(k) F sei F (7.157) 

= Im = P A 

= j œ? + D?k* 
Thus, eliminating the source from both sides of this equation, we may define
the instantaneous ‘D.C.’ ($\omega \to 0$) diffusion constant, given by the
Kubo-type relation,


2 
(Dw > 0)) = lim (im a Gr) . (7.158) 


If we take $G_F$ from eqn. (7.156), we see the triviality of this relation for
purely collisionless quantum fluctuations of the field, $\langle\rho\rangle$. By
taking the fluctuation average to be $\exp(iS/\hbar)$, we simply derive a
tautology. However, once we switch on thermal fluctuations or quantum
interactions (for which we need to know about quantum field theory), the Feynman
Green function picks up a temperature dependence and a more complicated
analytical structure, and this becomes non-trivial; see eqn. (6.61). Then it
becomes possible to express $D$ in terms of independent parameters, rather than
as the phenomenological constant in eqn. (7.143).


7.5.3 Forced Brownian motion 


A phenomenological description of Brownian motion for particles in a field is
given by the Langevin model. Newton’s second law for a particle perturbed by
random forces may be written in the form

$$m\,\frac{dv^i}{dt} = F^i - \alpha v^i, \tag{7.159}$$

where $v^i$ is the velocity of a particle in a field and $\alpha$ is a coefficient of
friction, by analogy with Stokes’ law. This equation clearly expresses only a
statistical phenomenology, and it cannot be derived from an action principle,
since it contains explicitly velocity-dependent terms, which can only arise
from statistical effects in a real dynamical system. The forcing term, $F^i$, is a
random force. By this, we mean that the time average of this force is zero,
i.e. it fluctuates in magnitude and direction in such a way that its time average
vanishes:


$$\langle F(t)\rangle = \frac{1}{T}\int_{t-T/2}^{t+T/2} F(t)\,dt = 0. \tag{7.160}$$


We may solve this equation simply, in the following ways.


Green function approach Consider the general solution of

$$a\,\frac{du}{dt} + b\,u = f(t), \tag{7.161}$$

where $a$ and $b$ are positive constants. Using the method of Green functions, we
solve this in the usual way. Writing this in operator/source form,

$$\left(a\,\frac{d}{dt} + b\right)u = f(t), \tag{7.162}$$

we have the formal solution in terms of the retarded Green function

$$u(t) = \int dt'\; G_r(t,t')\,f(t'), \tag{7.163}$$

where

$$\left(a\,\frac{d}{dt} + b\right)G_r(t,t') = \delta(t,t'). \tag{7.164}$$

Taking the Fourier transform, we have

$$G_r(t-t') = \int\frac{d\omega}{2\pi}\;\frac{e^{-i\omega(t-t')}}{(-ia\omega + b)}. \tag{7.165}$$
This Green function has a simple pole for $t - t' > 0$ at $\omega = -ib/a$, and the
contour is completed in the lower half-plane for $\omega$, making the semi-circle
at infinity vanish. The solution for the field $u(t)$ is thus

$$\begin{aligned}
u(t) &= \int_{-\infty}^{\infty} d\tau \int\frac{d\omega}{2\pi}\;\frac{e^{-i\omega(t-\tau)}}{(-ia\omega+b)}\,f(\tau) \\
&= \int_{-\infty}^{t} d\tau\;\left[-2\pi i\;\frac{e^{-\frac{b}{a}(t-\tau)}}{2\pi(-ia)}\right] f(\tau) \\
&= \frac{1}{a}\int_{-\infty}^{t} d\tau\; f(\tau)\,e^{\frac{b}{a}(\tau - t)}.
\end{aligned} \tag{7.166}$$

The lower limit of the integral is written as minus infinity since we have not
specified the time at which the force was switched on, but we could replace this 
by some finite time in the past by specifying boundary conditions more fully. 


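Differential equation approach Although the Green function method is
straightforward and quite simple, eqn. (7.161) can also be solved by an
alternative method. When $f(t) = 0$ it is solved by separation of variables,
giving

$$\begin{aligned}
\frac{du}{dt} &= -\frac{b}{a}\,u \\
u(t) &= u_0\,e^{-\frac{b}{a}t},
\end{aligned} \tag{7.167}$$

for some constant $u_0$. This is therefore the complementary function for the
differential equation. If the forcing term $f(t)$ is non-zero, this hints that we
can make the equation integrable by multiplying through by the integrating factor
$\exp(bt/a)$.

$$\begin{aligned}
\frac{d}{dt}\left(e^{\frac{b}{a}t}\,u(t)\right) &= \frac{1}{a}\left(a\,\frac{du}{dt} + b\,u(t)\right)e^{\frac{b}{a}t} \\
e^{\frac{b}{a}t}\,u(t) &= \frac{1}{a}\int_0^t d\tau\; f(\tau)\,e^{\frac{b}{a}\tau} \\
u(t) &= \frac{1}{a}\int_0^t d\tau\; f(\tau)\,e^{\frac{b}{a}(\tau - t)}.
\end{aligned} \tag{7.168}$$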


This is exactly analogous to making a gauge transformation in electrodynamics. 
Note that, since the integral limits are from 0 to t, u(t) cannot diverge unless 
f(t) diverges. The lower limit is by assumption. The general solution to 
eqn. (7.161) is therefore given by the particular integral in eqn. (7.168) plus 
an arbitrary constant times the function in eqn. (7.167). The solutions are 
typically characterized by exponential damping. This reproduces the answer 
in eqn. (7.166) marginally more quickly than the tried and trusted method of 
Green functions. This just goes to show that it never does any harm to consider 
alternative methods, even when in possession of powerful methods of general 
applicability. 
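
As a check on the two derivations, the integral solution can be evaluated
numerically and compared with a closed form. This is a sketch under stated
assumptions: the force is taken to be a constant $f_0$ switched on at $t = 0$
(so that the lower limit is zero), for which the integral gives
$u(t) = (f_0/b)(1 - e^{-bt/a})$; the function name, the trapezoidal rule and the
parameter values are illustrative choices made here.

```python
import math


def u_from_green_function(f, t, a, b, t0=0.0, n=20000):
    """Trapezoidal estimate of (1/a) * int_{t0}^{t} f(tau) exp((b/a)(tau - t)) dtau."""
    h = (t - t0) / n
    total = 0.0
    for i in range(n + 1):
        tau = t0 + i * h
        weight = 0.5 if i in (0, n) else 1.0  # end points count half
        total += weight * f(tau) * math.exp((b / a) * (tau - t))
    return total * h / a


a, b, f0 = 2.0, 3.0, 1.5


def constant_force(tau):
    """Constant forcing term, assumed to be switched on at tau = 0."""
    return f0


t = 1.2
numeric = u_from_green_function(constant_force, t, a, b)
closed_form = (f0 / b) * (1.0 - math.exp(-b * t / a))
```

The closed form saturates at $f_0/b$ for $t \gg a/b$, which is the exponential
damping towards a steady state referred to above.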


Diffusion and mobility Langevin’s equation plays a central role in the kinetic
theory of diffusion and conduction. Let $\dot{x}^i = v^i$, then, multiplying
through by $x$, we have

$$m\,x\,\frac{d^2x}{dt^2} = m\left[\frac{d}{dt}(x\dot{x}) - \dot{x}^2\right] = -\alpha\,x\dot{x} + x\,F(t). \tag{7.169}$$

Taking the kinetic (ensemble) average of both sides, and recalling that the
fluctuating force has zero average, we have that

$$m\left\langle\frac{d}{dt}(x\dot{x})\right\rangle = m\,\frac{d}{dt}\langle x\dot{x}\rangle = kT - \alpha\langle x\dot{x}\rangle, \tag{7.170}$$

where we have used the result from kinetic theory (the equi-partition theorem)
that $\frac{1}{2}m\langle\dot{x}^2\rangle = \frac{1}{2}kT$. We can solve this to give

$$\langle x\dot{x}\rangle = C\,e^{-\alpha t/m} + \frac{kT}{\alpha}. \tag{7.171}$$

At large times, the first of these terms decays and the system reaches a steady
state. We may integrate this to give

$$\langle x^2\rangle = \frac{2kT}{\alpha}\,t. \tag{7.172}$$
This tells us the mean square position. By comparing this to the diffusion
equation in eqn. (7.146) we find the effective diffusion coefficient

$$D = \frac{kT}{\alpha}. \tag{7.173}$$
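
The approach to the steady state can be made explicit. The sketch below assumes
the initial conditions $\langle x\dot{x}\rangle = 0$ and $\langle x^2\rangle = 0$
at $t = 0$, which fixes $C = -kT/\alpha$ in eqn. (7.171); these initial
conditions, the function names and the parameter values are assumptions
introduced here for illustration. Integrating
$\frac{d}{dt}\langle x^2\rangle = 2\langle x\dot{x}\rangle$ then gives a mean
square position whose late-time slope is $2kT/\alpha$.

```python
import math


def x_xdot(t, kT, alpha, m):
    """<x xdot> from eqn (7.171) with C = -kT/alpha, so that it vanishes at t = 0."""
    return (kT / alpha) * (1.0 - math.exp(-alpha * t / m))


def mean_square_x(t, kT, alpha, m):
    """<x^2> = 2 * integral of <x xdot> from 0 to t, assuming <x^2> = 0 at t = 0."""
    relaxation_time = m / alpha
    return (2.0 * kT / alpha) * (
        t - relaxation_time * (1.0 - math.exp(-t / relaxation_time))
    )


kT, alpha, m = 1.0, 2.0, 0.5   # illustrative values in arbitrary units
t1, t2 = 10.0, 20.0            # both much later than the relaxation time m/alpha

slope = (mean_square_x(t2, kT, alpha, m) - mean_square_x(t1, kT, alpha, m)) / (t2 - t1)
D_effective = 0.5 * slope      # compare <x^2> ~ 2 D t
```

At times short compared with $m/\alpha$ the growth is slower than linear; only
after the transient has decayed does the slope settle to $2D$ with
$D = kT/\alpha$.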

A related application is that of electrical conduction. Consider the same
diffusion process for charges $e$ in a uniform electric field $E$. The average of
the Langevin equation is now

$$m\,\frac{d\langle v^i\rangle}{dt} = eE^i - \alpha\langle v^i\rangle, \tag{7.174}$$

since $\langle F\rangle = 0$. In a steady state, the average acceleration is also
zero, even though microscopically there might be collisions which cause
fluctuations in the velocity. Thus we have, at steady state,

$$eE^i = \alpha\langle v^i\rangle. \tag{7.175}$$

We define the mobility, $\mu$, of the charges, for an isotropic system, as

$$\mu \equiv \frac{\langle v^i\rangle}{E^i} = \frac{e}{\alpha}. \tag{7.176}$$

The mobility is related to the diffusion constant by the Einstein relation

$$\frac{\mu}{D} = \frac{e}{kT}. \tag{7.177}$$
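
To give a feeling for the magnitudes involved, the Einstein relation can be
evaluated numerically. In the sketch below the constants are the SI values of
the Boltzmann constant and the elementary charge; the temperature and the
mobility are hypothetical inputs chosen for illustration only, and the function
name is introduced here.

```python
K_BOLTZMANN = 1.380649e-23          # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def diffusion_from_mobility(mu, temperature, charge=ELEMENTARY_CHARGE):
    """D = mu * k * T / e, rearranged from mu / D = e / (k T)."""
    return mu * K_BOLTZMANN * temperature / charge


temperature = 300.0   # kelvin (illustrative)
mobility = 0.1        # m^2 / (V s), a hypothetical value

kT_over_e = K_BOLTZMANN * temperature / ELEMENTARY_CHARGE  # in volts
diffusion = diffusion_from_mobility(mobility, temperature)  # in m^2 / s
```

The combination $kT/e$ has the dimensions of a voltage, so the relation simply
states that $D$ and $\mu$ differ by this thermal voltage.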
In an anisotropic system, there might be different coefficients for diffusion and
mobility in different directions. Then, eqn. (7.176) would become a tensor
relation, $\mu_{ij} \equiv \langle v_i\rangle/E_j$.


7.6 Vortex fields in 2 + 1 dimensions


Although one generally avoids speaking of particulate matter in field theory, 
since classically it is used to describe mainly smooth, continuous fields, there are 
occasions on which the solutions to the equations of motion lead unambiguously 
to pointlike objects. One such situation is the case of vortices. 

Vortices are charged, singular objects which arise in some physical systems
such as the non-linear Schrödinger equation. Vortices have the property that
they acquire a phase factor, by an Aharonov-Bohm-like effect, when they wind
around one another. They can usually be thought of as pointlike objects which
are penetrated by an infinitely thin line of magnetic flux. In 2 + 1 dimensions,
vortices are also referred to as anyons, and have a special relationship with
Chern-Simons field theories. It might seem strange that a field variable
$\phi(x)$, which covers all of space and time, could be made to represent such
singular objects as vortices. As we shall see in the following example, this is
made possible precisely by the singular nature of Green functions.

Consider a field, $\phi(x)$, representing pointlike objects in two spatial
dimensions with coordinates denoted for simplicity by $r = (x, y)$. We define the
winding angle, $\theta$, between any two pointlike objects in the field by

$$\theta(r - r') = \tan^{-1}\frac{\Delta y}{\Delta x} = \tan^{-1}\frac{y - y'}{x - x'}. \tag{7.178}$$

Notice that $\theta(r - r')$ is a function of coordinate differences between
pairs of points. We shall, in fact, relate this winding angle to the Green
function $g(x, x')$, for the Laplacian in two dimensions, which was calculated in
section 5.4.4.


7.6.1 A vortex model 


The study of Chern—Simons theories is motivated principally by two observa- 
tions: namely that important aspects of the quantum Hall effect are described 
by a Chern—Simons theory, and that a viable theory of high-temperature super- 
conductivity should be characterized by a parity-violating, anti-ferromagnetic 
state. Symmetry considerations lead to an action which does not possess 
space-reflection symmetry. The Chern—Simons action fits this prescription. 
These two physical systems are also believed to be essentially two-dimensional, 
planar systems. 

In its most primitive form, the action for the Chern—Simons model may be 
written in (2 + 1) dimensional flat spacetime as 


À 1 
S= f| ux (oaoa) +m’? + Ze + uean) , 
(7.179)

The equation of motion is thus


$$\frac{1}{2}\,\mu\,\epsilon^{\mu\nu\lambda}F_{\nu\lambda} = J^\mu. \tag{7.180}$$


The gauge-invariant current, $J^\mu$, is introduced for convenience and represents
the interaction with the matter fields arising from the gauge-covariant derivatives 
in eqn. (7.179). We shall not consider the full dynamics of this theory here; 
rather, it is interesting to see how the singular vortex phenomenon is reflected in 
the field variables. 


7.6.2 Green functions 


The basic Green function we shall use in the description of two-dimensional
vortices is the inverse Laplacian which was derived in section 5.4.4, but it
is also useful to define and elaborate on some additional symbols which are
encountered in the literature. We shall use the symbol $r^i$ as an abbreviation
for the coordinate difference $\Delta r^i = \Delta x^i = x^i - x'^i$, and the
symbol $\Delta r$ for the scalar length of this vector. Some authors define a
Green function vector by


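$$\begin{aligned}
G^i(r - r') &= \epsilon^{ij}\,\partial_j\, g(r - r') \\
&= -\frac{1}{2\pi}\,\epsilon^{ij}\,\frac{\hat{r}_j}{|r - r'|},
\end{aligned} \tag{7.181}$$

where $\hat{r}$ is a unit vector along $r - r'$. The two-dimensional curl of this
function is thus

$$\begin{aligned}
\nabla\times G(r) &= \epsilon_{ij}\,\partial_i\, G_j(r - r') \\
&= \epsilon_{ij}\,\epsilon_{jk}\,\partial_i\partial_k\, g(r - r') \\
&= -\nabla^2 g(r - r') \\
&= \delta(r - r').
\end{aligned} \tag{7.182}$$

In other words, $G^i(r - r')$ is the inverse of the curl operator.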


7.6.3 Relationship between $\theta(r - r')$ and $g(r - r')$


To obtain a relationship between the coordinates and the winding function
$\theta(r)$, we note that

$$\begin{aligned}
\partial_i\tan\theta(r - r') &= \partial_i\left(\frac{\sin\theta(r - r')}{\cos\theta(r - r')}\right) \\
&= \partial_i\theta(r - r')\,\sec^2\theta(r - r') \\
&= \partial_i\theta(r - r')\left(1 + \tan^2\theta(r - r')\right).
\end{aligned} \tag{7.183}$$

From eqn. (7.178), this translates into


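$$\begin{aligned}
\partial_i\theta(r - r') &= \frac{\partial_i\left(\frac{\Delta y}{\Delta x}\right)}{1 + \left(\frac{\Delta y}{\Delta x}\right)^2} \\
&= \frac{\Delta x\,(\partial_i\Delta y) - \Delta y\,(\partial_i\Delta x)}{r^2} \\
&= -\epsilon_{ij}\,\frac{\hat{r}_j}{|r - r'|}.
\end{aligned} \tag{7.184}$$

This last form is significant since the logarithm has a similar property, namely

$$\epsilon_{ij}\,\partial_j\ln|r - r'| = \epsilon_{ij}\,\frac{\hat{r}_j}{|r - r'|}, \tag{7.185}$$

and thus we immediately have the relationship:

$$\frac{1}{2\pi}\left(\partial_i\theta(r - r')\right) = G_i(r) = \epsilon_{ij}\,\partial_j\, g(r - r'). \tag{7.186}$$

It is understood that partial derivatives acting on $r - r'$ act on the first
argument $r$.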


7.6.4 Singular nature of $\theta(r - r')$


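The consistency of the above relations supplies us with an unusual, and perhaps
somewhat surprising relation, namely

$$\epsilon_{ij}\,\partial_i\partial_j\,\theta(r - r') = 2\pi\,\delta(r - r') \tag{7.187}$$

or

$$[\partial_1, \partial_2]\,\theta(r - r') = 2\pi\,\delta(r - r'). \tag{7.188}$$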


This relation tells us that the partial derivatives do not commute when acting on
the function $\theta(r)$. This is the manifestation of a logarithmic singularity in
the field, or, physically, the non-triviality of the phase accrued by winding
vortices around one another. Although the field is formally continuous, it has
this non-analytical property at every point.
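
The singular relation has a simple integrated form: by Stokes' theorem, the
total change of $\theta(r - r')$ around a closed loop is $2\pi$ if the loop
encloses $r'$ and zero otherwise. The sketch below checks this numerically. It
is an illustration under stated assumptions: the loop is a polygonal
approximation to the unit circle traversed anti-clockwise, the two-argument
arctangent is used to resolve the quadrant left ambiguous by
$\tan^{-1}(\Delta y/\Delta x)$, and the function names are introduced here.

```python
import math


def winding_angle(r, r_prime):
    """theta(r - r') from the two-argument arctangent, in the range (-pi, pi]."""
    return math.atan2(r[1] - r_prime[1], r[0] - r_prime[0])


def total_winding(path, r_prime):
    """Sum the change in theta along a closed polygonal path around r_prime."""
    total = 0.0
    for p, q in zip(path, path[1:] + path[:1]):
        step = winding_angle(q, r_prime) - winding_angle(p, r_prime)
        # Undo the artificial 2*pi jump at the branch cut of atan2.
        step = (step + math.pi) % (2.0 * math.pi) - math.pi
        total += step
    return total


n_points = 200
unit_circle = [(math.cos(2.0 * math.pi * k / n_points),
                math.sin(2.0 * math.pi * k / n_points)) for k in range(n_points)]

enclosed = total_winding(unit_circle, (0.2, -0.1))     # vortex inside the loop
not_enclosed = total_winding(unit_circle, (3.0, 0.0))  # vortex outside the loop
```

The result does not depend on where inside the loop the point $r'$ sits, which
is the content of the delta function on the right-hand side of eqn. (7.187).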

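Using complex coordinates $z = x^1 + ix^2$ and conjugate variables $\bar{z}$, the
above discussion leads to the relations in complex form:

$$\begin{aligned}
\partial_z\,\bar{z}^{-1} &= \partial_z\partial_{\bar{z}}\ln|z|^2 \\
&= \pi\,\delta(|z|).
\end{aligned} \tag{7.189}$$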


Part 2 
Groups and fields 


8 


Field transformations 


The previous chapters take a pragmatic, almost engineering, approach to the 
solution of field theories. The recipes of chapter 5 are invaluable in generating 
solutions to field equations in many systems, but the reason for their
effectiveness remains hidden. This chapter embarks upon a train of thought, which
lies at the heart of the theory of dynamical systems, which explains the
fundamental reasons why field theories look the way they do, how physical
quantities are
related to the fields in the action, and how one can construct theories which give 
correct answers regardless of the perspective of the observer. Before addressing 
these issues directly, it is necessary to understand some core notions about 
symmetry on a more abstract level. 


8.1 Group theory 


To pursue a deeper understanding of dynamics, one needs to know the language 
of transformations: group theory. Group theory is about families of transforma- 
tions with special symmetry. The need to parametrize symmetry groups leads 
to the idea of algebras, so it will also be necessary to study these. 

Transformations are central to the study of dynamical systems because all 
changes of variable, coordinates or measuring scales can be thought of as 
transformations. The way one parametrizes fields and spacetime is a matter of 
convenience, but one should always be able to transform any results into a new 
perspective whenever it might be convenient. Even the dynamical development 
of a system can be thought of as a series of transformations which alter the 
system’s state progressively over time. The purpose of studying groups is 
to understand the implications posed by constraints on a system: the field 
equations and any underlying symmetries — but also the rules by which the 
system unfolds on the background spacetime. In pursuit of this goal, we shall 
find universal themes which enable us to understand many structures from a few 
core principles.


8.1.1 Definition of a group


A group is a set of objects, usually numbers or matrices, which satisfies the 
following conditions. 


(1) There is a rule of composition for the objects. When two objects in a
group are combined using this rule, the resulting object also belongs to
the group. Thus, a group is closed under the action of the composition
rule. If $a$ and $b$ are two matrices, then $a\cdot b = b\cdot a$ is not
necessarily true. If $a\cdot b = b\cdot a$ for all elements, the group is said
to be Abelian, otherwise it is non-Abelian.


(2) The combination rule is associative, i.e. $(a\cdot b)\cdot c = a\cdot(b\cdot c)$.


(3) The identity element belongs to the set, i.e. an object $I$ which satisfies
$a\cdot I = a$.


(4) Every element $a$ in the set has a right-inverse $a^{-1}$, such that
$a^{-1}\cdot a = I$.


A group may contain one or more sub-groups. These are sub-sets of the whole 
group which also satisfy all of the group axioms. Sub-groups always overlap 
with one another because they must all contain the identity element. Every 
group has two trivial or improper sub-groups, namely the identity element and 
the whole group itself. The dimension of a group $d_G$ is defined to be the
number of independent degrees of freedom in the group, or the number of
generators required to represent it. This is most easily understood by looking
at the examples in the next section. The order of a group $O_G$ is the number of
distinct elements in the group. In a continuous group the order is always infinite.


If the ordering of elements in the group with respect to the combination rule 
matters, i.e. the group elements do not commute with one another, the group is 
said to be non-Abelian. In that case, there always exists an Abelian sub-group 
which commutes with every element of the group, called the centre. Schur’s
lemma tells us that, in an irreducible matrix representation, any element of a
group which commutes with every other must be a multiple of the identity
element. The centre of a group is usually a discrete group, $Z_N$, with a finite
number, $N$, of elements called the rank of the group.
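
The axioms and definitions above can be verified by brute force for a small
finite group. The sketch below uses the permutations of three objects as an
example; this choice of group, the representation of elements as tuples, and the
variable names are assumptions made here for illustration and are not taken from
the text.

```python
from itertools import permutations, product


def compose(p, q):
    """Composition rule (p . q)(i) = p[q[i]]: apply q first, then p."""
    return tuple(p[i] for i in q)


elements = list(permutations(range(3)))  # the six permutations of three objects
identity = tuple(range(3))

# (1) closure under the composition rule
closed = all(compose(a, b) in elements for a, b in product(elements, repeat=2))
# (2) associativity
associative = all(compose(compose(a, b), c) == compose(a, compose(b, c))
                  for a, b, c in product(elements, repeat=3))
# (3) identity element
has_identity = all(compose(a, identity) == a for a in elements)
# (4) every element has an inverse
has_inverses = all(any(compose(b, a) == identity for b in elements)
                   for a in elements)

abelian = all(compose(a, b) == compose(b, a)
              for a, b in product(elements, repeat=2))
# The centre: elements which commute with every element of the group.
centre = [z for z in elements
          if all(compose(z, a) == compose(a, z) for a in elements)]
order = len(elements)
```

For this example the group is non-Abelian with order six, and its centre
contains only the identity element, the smallest Abelian sub-group possible.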


8.1.2 Group transformations 


In field theory, groups are used to describe the relationships between compo- 
nents in a multi-component field, and also the behaviour of the field under 
spacetime transformations. One must be careful to distinguish between two 
vector spaces in the discussions which follow. It is also important to be very 
clear about what is being transformed in order to avoid confusion over the 
names.

• Representation space. This is the space on which the group transformations
act, or the space in which the objects to be transformed live. In field theory,
when transformations relate to internal symmetries, the components of field
multiplets $(\phi_1, \phi_2, \ldots, \phi_{d_R})$ are the coordinates on
representation space. When transformations relate to changes of spacetime
frame, then spacetime coordinates are the representation space.


• Group space. This is an abstract space of dimension $d_G$. The dimension
of this space is the number of independent transformations which the group is
composed of. The coordinates $(\theta_1, \theta_2, \ldots, \theta_{d_G})$ in this
space are measured with respect to a set of basis matrices called the generators
of the group.


Since fields live on spacetime, the full representation space of a field consists
of spacetime ($\mu, \nu$ indices) combined with any hidden degrees of freedom:
spin, charge, colour and any other hidden labels or indices (all denoted with
indices $A, B, a, b, \alpha, \beta$) which particles might have. In practice, some
groups (e.g. the Lorentz group) act only on spacetime, others (e.g. $SU(3)$) act
only on hidden indices. In this chapter, we shall consider group theory on a
mainly abstract level, so this distinction need not be of concern.

A field, $\phi(x)$, might be a spacetime-scalar (i.e. have no spacetime indices),
but also be a vector on representation space (have a single group index).


$$\phi(x)_A = \begin{pmatrix} \phi_1(x) \\ \phi_2(x) \\ \vdots \\ \phi_{d_R}(x) \end{pmatrix}. \tag{8.1}$$


The transformation rules for fields with spacetime (coordinate) indices are
therefore

$$\begin{aligned}
\phi &\to \phi \\
A_\mu &\to U_\mu{}^\nu A_\nu \\
g_{\mu\nu} &\to U_\mu{}^\rho\, U_\nu{}^\lambda\, g_{\rho\lambda},
\end{aligned} \tag{8.2}$$

and for multiplet transformations they are

$$\begin{aligned}
\phi_A &\to U_{AB}\,\phi_B \\
A^a_\mu &\to U^a{}_b\, A^b_\mu \\
g^A_{\mu\nu} &\to U^A{}_B\, g^B_{\mu\nu}.
\end{aligned} \tag{8.3}$$

All of the above have the generic form of a vector $v$ with Euclidean components
$v^A = v_A$ transforming by matrix multiplication:

$$v \to U v, \tag{8.4}$$

or

$$v'^A = U^A{}_B\, v^B. \tag{8.5}$$


The label $A = 1, \ldots, d_R$, where $d_R$ is the dimension of the representation.
Thus, the transformation matrix $U$ is a $d_R \times d_R$ matrix and $v$ is a
$d_R$-component column vector. The group space is Euclidean, so raised and
lowered $A, B$ indices are identical here.

Note that multiplet indices (those which do not label spacetime coordinates)
for general group representations $G_R$ are labelled with upper case Latin
characters $A, B = 1, \ldots, d_R$ throughout this book. Lower case Latin letters
$a, b = 1, \ldots, d_G$ are used to distinguish the components of the adjoint
representation $G_{\rm adj}$.


In general, the difference between a representation of a group and the group 
itself is this: while a group might have certain unique abstract properties which 
define it, the realization of those properties in terms of numbers, matrices or 
functions might not be unique, and it is the explicit representation which is 
important in practical applications. In the case of Lie groups, there is often a 
variety of possible locally isomorphic groups which satisfy the property (called 
the Lie algebra) that defines the group. 


8.1.3 Use of variables which transform like group vectors 


The property of transforming a dynamical field by simple matrix multiplication 
is very desirable in quantum theory where symmetries are involved at all 
levels. It is a direct representation of the Markov property of physical law. In 
chapter 14, it becomes clear that invariances are made extremely explicit and 
are algebraically simplest if transformation laws take the multiplicative form in 
eqn. (8.5). 

An argument against dynamical variables which transform according to group 
elements is that they cannot be observables, because they are non-unique. 
Observables can only be described by invariant quantities. A vector is, by 
definition, not invariant under transformations; however, the scalar product of 
vectors is invariant. 
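
The point that a vector changes under the group while a scalar product does not can be checked numerically. The following is only a minimal sketch, assuming a two-dimensional rotation as the transformation matrix; the function names, the angle and the vectors are illustrative choices, not taken from the text.

```python
import math

def rotate(theta, v):
    """Multiply the column vector v by a 2x2 rotation matrix U(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] + s * v[1], -s * v[0] + c * v[1]]

def dot(u, v):
    """Euclidean scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

theta = 0.7                      # arbitrary illustrative angle
v, w = [3.0, 4.0], [-1.0, 2.0]   # arbitrary illustrative vectors
v_new, w_new = rotate(theta, v), rotate(theta, w)

# The components change, but the scalar product is the same before and after.
components_changed = any(abs(a - b) > 1e-9 for a, b in zip(v, v_new))
product_before, product_after = dot(v, w), dot(v_new, w_new)
print(components_changed, product_before, product_after)
```

The components of `v` differ before and after the rotation, whereas the two scalar products agree to rounding error.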

In classical particle mechanics, the dynamical variables q(t) and p(t) do 
not transform by simple multiplication of elements of the Galilean symmetry. 
Instead, there is a set of eqns. (14.34) which describes how the variables change 
under the influence of group generators. Some would say that such a formulation 
is most desirable, since the dynamical variables are directly observable, but the 
price for this is a more complicated set of equations for the symmetries. 

As we shall see in chapter 14, the quantum theory is built upon the idea that 
the dynamical variables should transform like linear combinations of vectors on 
some group space. Observables are extracted from these vectors with the help 
of operators, which are designed to pick out actual data as eigenvalues of the 
operators. 


8.2 Cosets and the factor group 
8.2.1 Cosets 


Most groups can be decomposed into non-overlapping sub-sets called cosets. 
Cosets belong to a given group and one of its sub-groups. Consider then a group 
$G$ of order $O_G$, which has a sub-group $H$ of order $O_H$. A coset is defined by 
acting with group elements on the elements of the sub-group. In a non-Abelian 
group one therefore distinguishes between left and right cosets, depending on 
whether the group elements pre- or post-multiply the elements of the sub-group. 
The left coset of a given group element is thus defined by 

$$ GH = \{ G H_1, G H_2, \ldots, G H_{O_H} \} \tag{8.6} $$

and the right coset is defined by 

$$ HG = \{ H_1 G, H_2 G, \ldots, H_{O_H} G \}. \tag{8.7} $$


The cosets have order $O_H$ and one may form a coset from every element of $G$ 
which is not in the sub-group itself (since the coset formed by a member of the 
coset itself is simply that coset, by virtue of the group axioms). This means that 
cosets do not overlap. 

Since cosets do not overlap, the $O_G - O_H$ elements of $G$ which lie outside the 
sub-group are shared out among the distinct cosets. It is possible to go on 
forming cosets until all these elements are exhausted. The full group can be 
written as a sum of a sub-group and all of its cosets: 

$$ G = H + G_1 H + G_2 H + \cdots + G_p H, \tag{8.8} $$

where $p$ is some integer. The value of $p$ can be determined by counting the 
orders of the elements in this equation: 

$$ O_G = O_H + O_H + O_H + \cdots + O_H = (p + 1)\, O_H. \tag{8.9} $$

Thus, 

$$ O_G = (p + 1)\, O_H. \tag{8.10} $$

Notice that the number of elements in the sub-group must be a factor of the 
number of elements in the whole group. This is necessarily true since all cosets 
are of order $O_H$. 
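
The counting argument can be illustrated on a small finite group. The sketch below assumes the permutation group of three objects as $G$ and a two-element sub-group as $H$; these choices, and all names in the code, are illustrative and do not come from the text.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q', i.e. i -> p[q[i]]."""
    return tuple(p[i] for i in q)

# G: all 6 permutations of three objects; H: {identity, swap of objects 0 and 1}.
G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]

# Left cosets gH, stored as sets so that repeated cosets collapse into one.
left_cosets = {frozenset(compose(g, h) for h in H) for g in G}

for coset in sorted(left_cosets, key=sorted):
    print(sorted(coset))

number_of_cosets = len(left_cosets)                  # plays the role of p + 1
orders = {len(c) for c in left_cosets}               # every coset has order O_H
covers_group = set().union(*left_cosets) == set(G)   # the cosets exhaust G
print(number_of_cosets, orders, covers_group)
```

Here the six elements split into three non-overlapping cosets of order two, so that $O_G = (p+1)\,O_H$ with $p = 2$.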

8.2.2 Conjugacy and invariant sub-groups 


If $g_1$ is an element of a group $G$, and $g_2$ is another element, then $g_c$ defined by 

$$ g_c = g_2\, g_1\, g_2^{-1} \tag{8.11} $$

is said to be an element of the group $G$ which is conjugate to $g_1$. One can form 
conjugates from every other element in the group. Every element is conjugate 
to itself since 

$$ g = I\, g\, I^{-1}. \tag{8.12} $$

Similarly, all elements in an Abelian group are conjugate only to themselves. 
Conjugacy is a mutual relationship. If $g_1$ is conjugate to $g_2$, then $g_2$ is conjugate 
to $g_1$, since 

$$ \begin{aligned} g_1 &= g\, g_2\, g^{-1} \\ g_2 &= g^{-1}\, g_1\, g. \end{aligned} \tag{8.13} $$

If $g_1$ is conjugate to $g_2$ and $g_2$ is conjugate to $g_3$, then $g_1$ and $g_3$ are also 
conjugate. This implies that conjugacy is an equivalence relation. 

Conjugate elements of a group are similar in the sense of similarity transfor- 
mations, e.g. matrices which differ only by a change of basis: 


$$ M' = \Lambda\, M\, \Lambda^{-1}. \tag{8.14} $$


The conjugacy class of a group element g is the set of all elements conjugate to 
g: 


$$ \{ I\, g\, I^{-1},\; g_1\, g\, g_1^{-1},\; g_2\, g\, g_2^{-1},\; \ldots \}. \tag{8.15} $$


A sub-group H of G is said to be an invariant sub-group if every element of the 
sub-group is conjugate to another element in the sub-group: 


$$ H_c = G\, H\, G^{-1} = H. \tag{8.16} $$


This means that the sub-group is invariant with respect to the action of the group, 
or that the only action of the group is to permute elements of the sub-group. It 
follows trivially from eqn. (8.16) that 


GH = HG, (8.17) 


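thus the left and right cosets of an invariant sub-group are identical. This means 
that $H$ commutes with $G$ as a set: each product $gh$ equals $h'g$ for some $h'$ in $H$, 
although $h'$ need not equal $h$. In the special case that every element of $H$ 
commutes individually with every element of $G$, $H$ is said to belong to the 
centre of the group. 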
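
Conjugacy classes, invariant sub-groups and the centre can all be enumerated directly for a small group. The following sketch again assumes the permutation group of three objects; the sub-groups `A3` and `Z2` and the helper names are illustrative assumptions.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q', i.e. i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(x, g):
    """Return x g x^{-1}."""
    return compose(compose(x, g), inverse(x))

G = list(permutations(range(3)))

# Conjugacy class of g: the set of all x g x^{-1}; equal classes collapse in the set.
classes = {frozenset(conjugate(x, g) for x in G) for g in G}
class_sizes = sorted(len(c) for c in classes)

def is_invariant(H):
    """True if x h x^{-1} stays inside H for every x in G and h in H."""
    return all(conjugate(x, h) in H for x in G for h in H)

A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # identity and the two cyclic permutations
Z2 = {(0, 1, 2), (1, 0, 2)}              # identity and one swap

# Elements commuting individually with every element of G.
centre = [z for z in G if all(compose(z, g) == compose(g, z) for g in G)]
print(class_sizes, is_invariant(A3), is_invariant(Z2), centre)
```

In this example the cyclic sub-group is invariant and the two-element sub-group is not, while only the identity commutes with every element of the group.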

8.2.3 Schur’s lemma and the centre of a group 


Schur’s lemma states that any group element which commutes with every other 
element of the group must, in an irreducible matrix representation, be a multiple 
of the identity element. This result proves to be important in several contexts 
in group theory. 


8.2.4 The factor group G/H 


The factor group, also called the group of cosets, is formed from an invariant 
sub-group $H$ of a group $G$. Since each coset formed from $H$ is distinct, one can 
show that the set of cosets of $H$ with $G$ forms a group which is denoted $G/H$. 
This follows from the property $gH = Hg$ of invariant sub-groups. If we combine 
cosets by the group rule, then 

$$ Hg_1 \cdot Hg_2 = H H g_1 g_2 = H (g_1 \cdot g_2), \tag{8.18} $$

since $H \cdot H = H$. The group axioms are satisfied. 

(1) The combination rule is the usual combination rule for the group. 

(2) The associative law is valid for coset combination: 

$$ (Hg_1 \cdot Hg_2) \cdot Hg_3 = H(g_1 \cdot g_2) \cdot Hg_3 = H((g_1 \cdot g_2) \cdot g_3). \tag{8.19} $$

(3) The identity of $G/H$ is $H \cdot I$. 

(4) The inverse of $Hg$ is $Hg^{-1}$. 


The number of independent elements in this group (the order of the group) is, 
from eqn. (8.10), $p + 1$ or $O_G/O_H$. Initially, it might appear confusing from 
eqn. (8.7) that the number of elements in the sub-group is in fact multiplied 
by the number of elements in the group, giving a total number of elements in 
the factor group of $O_G \times O_H$. This is wrong, however, because one must be 
careful not to count cosets which are similar more than once; indeed, this is 
the point behind the requirement of an invariant sub-group. Cosets which are 
merely permutations of one another are considered to be equivalent. 
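
The construction of a factor group can be made concrete with a small example. The sketch below assumes the permutation group of three objects as $G$ and its cyclic sub-group as the invariant sub-group $H$; these choices and the helper names are illustrative only.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q', i.e. i -> p[q[i]]."""
    return tuple(p[i] for i in q)

G = list(permutations(range(3)))
IDENTITY = (0, 1, 2)
A3 = [IDENTITY, (1, 2, 0), (2, 0, 1)]   # invariant sub-group H (cyclic permutations)

def right_coset(H, g):
    """The coset Hg as a set of group elements."""
    return frozenset(compose(h, g) for h in H)

def coset_product(c1, c2):
    """Combine two cosets by multiplying all of their members pairwise."""
    return frozenset(compose(a, b) for a in c1 for b in c2)

cosets = {right_coset(A3, g) for g in G}   # the elements of G/H
identity_coset = right_coset(A3, IDENTITY)

closed = all(coset_product(c1, c2) in cosets for c1 in cosets for c2 in cosets)
has_identity = all(coset_product(identity_coset, c) == c for c in cosets)
has_inverses = all(any(coset_product(c, d) == identity_coset for d in cosets)
                   for c in cosets)
order = len(cosets)
print(order, closed, has_identity, has_inverses)
```

The set of cosets closes under the combination rule and has order $O_G/O_H = 2$, rather than $O_G \times O_H$.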


8.2.5 Example of a factor group: SU(2)/Z_2 


Many group algebras generate groups which are the same except for their 
maximal Abelian sub-group, called the centre. This virtual equivalence is 
determined by factoring out the centre, leaving only the factor group which 
has a trivial centre (the identity); thus, factor groups are important in issues 
of spontaneous symmetry breaking in physics, where one is often interested in 
the precise group symmetry rather than algebras. As an example of a factor 
group, consider $SU(2)$. The group elements of $SU(2)$ can be parametrized in 
terms of $d_G = 3$ parameters, as shown in eqn. (8.131). There is a redundancy in 
these parameters. For example, one can generate the identity element from each 
of the matrices $g_1(\theta_1)$, $g_2(\theta_2)$, $g_3(\theta_3)$ by choosing $\theta_A$ to be zero. 

A non-trivial Abelian sub-group in these generators must come from the 
diagonal matrix $g_3(\theta_3)$. Indeed, one can show quite easily that $g_3$ commutes with 
any of the generators for any $\theta_A \neq 0$, if and only if 
$\exp(i\frac{1}{2}\theta_3) = \exp(-i\frac{1}{2}\theta_3) = \pm 1$. 
Thus, there are two possible values of $\theta_3$, arising from one of the generators; 
these lead to an Abelian sub-group, and the group elements they correspond to 
are: 

$$ H = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},\; \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\}, \tag{8.20} $$


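which form a $2 \times 2$ representation of the discrete group $Z_2$. This sub-group is 
invariant, because its elements commute with every element of the group, and we 
may therefore form the right cosets of $H$ for every other element of the group: 

$$ \begin{aligned}
H \cdot g_1(\theta_1) &= \{ g_1(\theta_1),\, -g_1(\theta_1) \} \\
H \cdot g_1(\theta_1') &= \{ g_1(\theta_1'),\, -g_1(\theta_1') \} \\
H \cdot g_1(\theta_1'') &= \{ g_1(\theta_1''),\, -g_1(\theta_1'') \} \\
&\;\;\vdots \\
H \cdot g_2(\theta_2) &= \{ g_2(\theta_2),\, -g_2(\theta_2) \} \\
H \cdot g_2(\theta_2') &= \{ g_2(\theta_2'),\, -g_2(\theta_2') \} \\
&\;\;\vdots \\
H \cdot g_3(\theta_3) &= \{ g_3(\theta_3),\, -g_3(\theta_3) \} \\
&\;\;\vdots
\end{aligned} \tag{8.21} $$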


The last line is assumed to exclude the members of $g_3$, which generate $H$, and 
the elements of $g_1$ and $g_2$, which give rise to the identity in $Z_2$, are also excluded 
from this list. That is because we are listing distinct group elements rather than 
the combinations, which are produced by a parametrization of the group. 

The two columns on the right hand side of this list are two equivalent copies 
of the factor group $SU(2)/Z_2$. They are simply mirror images of one another 
which can be transformed into one another by the action of an element of $Z_2$. 
Notice that the full group is divided into two invariant pieces, each of which has 
half the total number of elements from the full group. The fact that these coset 
groups are possible is connected with multiple coverings. In fact, it turns out 
that this property is responsible for the double-valued nature of electron spin, 
or, equivalently, the link between the real rotation group $SO(3)$ ($d_G = 3$) and 
the complexified rotation group, $SU(2)$ ($d_G = 3$). 
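
The double covering can be seen numerically. The sketch below assumes the standard map $R_{ij} = \frac{1}{2}{\rm Tr}(\sigma_i U \sigma_j U^\dagger)$ from an $SU(2)$ matrix to a real $3 \times 3$ rotation matrix, which is not written out in the text; the parametrization in `su2` and the chosen axis and angle are likewise illustrative assumptions.

```python
import math

SIGMA = [  # Pauli matrices
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Hermitian conjugate of a square matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))] for i in range(len(a))]

def su2(theta, n):
    """U = cos(theta/2) I + i sin(theta/2) n.sigma for a unit vector n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * (i == j) + 1j * s * sum(n[k] * SIGMA[k][i][j] for k in range(3))
             for j in range(2)] for i in range(2)]

def rotation(U):
    """R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger)."""
    Ud = dagger(U)
    R = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(matmul(SIGMA[i], U), matmul(SIGMA[j], Ud))
            row.append(0.5 * (m[0][0] + m[1][1]).real)
        R.append(row)
    return R

n = (0.6, 0.0, 0.8)                      # illustrative unit vector
U = su2(1.1, n)
minus_U = [[-x for x in row] for row in U]
R_plus, R_minus = rotation(U), rotation(minus_U)
max_diff = max(abs(R_plus[i][j] - R_minus[i][j]) for i in range(3) for j in range(3))

U_2pi = su2(2 * math.pi, n)              # a full turn gives minus the identity
R_2pi = rotation(U_2pi)
print(max_diff, U_2pi[0][0], R_2pi[0])
```

Both members of each coset $\{U, -U\}$ give the same rotation matrix, and a rotation through $2\pi$ returns the $3 \times 3$ matrix to the identity while the $2 \times 2$ matrix becomes $-I$.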

8.3 Group representations 


A representation of a group is a mapping between elements of the group and 
elements of the general linear group of either real matrices, GL(n, R), or 
complex matrices, GL(n, C). Put another way, it is a correspondence between 
the abstract group and matrices such that each group element can be represented 
in matrix form, and the rule of combination is replaced by matrix multiplication. 


8.3.1 Definition of a representation G_R 


If each element $g$ of a group $G$ can be assigned a non-singular $d_R \times d_R$ matrix 
$U_R(g)$, such that matrix multiplication preserves the group combination rule 

$$ \begin{aligned} g_{12} &= g_1 \cdot g_2, \\ U_R(g_{12}) &= U_R(g_1 \cdot g_2) = U_R(g_1)\, U_R(g_2), \end{aligned} \tag{8.22} $$

then the set of matrices is said to provide a $d_R$ dimensional representation of 
the group $G$. The representation is denoted collectively $G_R$ and is composed 
of matrices $U_R$. In most cases we shall call group representations $U$ to avoid 
excessive notation. 
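
The defining property in eqn. (8.22) is easy to test for a small example. The sketch below assumes the cyclic group of four elements, with addition modulo 4 as the combination rule, represented by powers of a quarter-turn rotation matrix; the group, the matrix `R` and the function `U` are illustrative choices.

```python
R = [[0, 1], [-1, 0]]   # rotation through a quarter turn, with integer entries

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def U(k):
    """Matrix assigned to the group element k (an integer modulo 4)."""
    result = [[1, 0], [0, 1]]
    for _ in range(k % 4):
        result = matmul(result, R)
    return result

# Group rule: k1 . k2 = (k1 + k2) mod 4. Check U(k1 . k2) = U(k1) U(k2) for all pairs.
preserves_rule = all(U((j + k) % 4) == matmul(U(j), U(k))
                     for j in range(4) for k in range(4))

# Non-singularity: every representing matrix has a non-zero determinant.
determinants = [m[0][0] * m[1][1] - m[0][1] * m[1][0] for m in (U(k) for k in range(4))]
print(preserves_rule, determinants)
```

Every product of matrices reproduces the matrix of the combined element, so the four matrices provide a two-dimensional representation of this group.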


8.3.2 Infinitesimal group generators 


If one imagines a continuous group geometrically, as a vector space in which 
every point is a new element of the group, then, using a set of basis vectors, it is 
possible to describe every element in this space in terms of coefficients to these 
basis vectors. Matrices too can be the basis of a vector space, which is why 
matrix representations are possible. The basis matrices which span the vector 
space of a group are called its generators. 

If one identifies the identity element of the group with the origin of this 
geometrical space, the number of linearly independent vectors required to reach 
every element in a group, starting from the identity, is the dimension of the 
space, and is also called the dimension of the group $d_G$. Note that the number 
of independent generators, $d_G$, is unrelated to their size $d_R$ as matrices. 

Thus, given that every element of the group lies in this vector space, an 
arbitrary element can be described by a vector whose components (relative to the 
generator matrices) uniquely identify that element. For example, consider the 
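group $SU(2)$, which has dimension $d_G = 3$. In the fundamental representation, 
it has three generators (the Pauli matrices) with $d_R = 2$: 

$$ T_1 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad T_2 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad T_3 = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{8.23} $$

A general point in group space may thus be labelled by a $d_G$ dimensional vector 
$(\theta_1, \theta_2, \theta_3)$: 

$$ \Theta = \theta_1 T_1 + \theta_2 T_2 + \theta_3 T_3. \tag{8.24} $$
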
A general element of the group is then found by exponentiating this generalized 
generator: 

$$ U_R = \exp(i\Theta). \tag{8.25} $$

$U_R$ is then a two-dimensional matrix representation of the group formed from 
two-dimensional generators. Alternatively, one may exponentiate each generator 
separately, as in eqn. (8.131), and combine them by matrix multiplication 
to reach the same group elements. For generators which commute, this follows 
from the property that multiplication of exponentials leads to the addition of 
the arguments; for generators which do not commute, the two parametrizations 
assign different parameter values to the same group element. 
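
The exponentiation in eqn. (8.25) can be carried out numerically. This is a minimal sketch, assuming a truncated power series for the matrix exponential (adequate only for small matrices of modest norm); the helper names and parameter values are illustrative. It also compares the exponential of a sum of two generators with the product of their separate exponentials.

```python
import cmath

T = [  # generators T_k = sigma_k / 2, as in eqn. (8.23)
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential summed from its power series."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def i_theta(thetas):
    """The matrix i * (theta_1 T_1 + theta_2 T_2 + theta_3 T_3)."""
    return [[1j * sum(thetas[k] * T[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))

# A diagonal generator exponentiates to a diagonal matrix of phases.
theta = 0.9
closed_form = [[cmath.exp(0.5j * theta), 0], [0, cmath.exp(-0.5j * theta)]]
series_error = max_diff(expm(i_theta([0, 0, theta])), closed_form)

# A general element is unitary with unit determinant.
U = expm(i_theta([1.0, 1.0, 0.0]))
U_dagger = [[U[j][i].conjugate() for j in range(2)] for i in range(2)]
unitarity_error = max_diff(matmul(U, U_dagger), [[1, 0], [0, 1]])
determinant = U[0][0] * U[1][1] - U[0][1] * U[1][0]

# exp(i T_1) exp(i T_2) differs from exp(i (T_1 + T_2)) because T_1 and T_2 do not commute.
product = matmul(expm(i_theta([1.0, 0, 0])), expm(i_theta([0, 1.0, 0])))
ordering_gap = max_diff(U, product)
print(series_error, unitarity_error, determinant, ordering_gap)
```

The non-zero `ordering_gap` shows that the same parameter values label different group elements in the two parametrizations when the generators do not commute.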

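For continuous groups generally, we can formalize this by writing a Taylor 
expansion of a group element $U(\theta)$ about the identity $I \equiv U(0)$, 

$$ U(\theta_A) = U(0) + \sum_{A=1}^{d_G} \theta_A \left( \frac{\partial U}{\partial \theta_A} \right)\Bigg|_{\theta_A = 0} + \cdots, \tag{8.26} $$

where $d_G$ is the dimension of the group. We can write this 

$$ \begin{aligned} U(\theta) &= U(0) + \sum_{A=1}^{d_G} \theta_A T_A + \frac{1}{2!}\, \theta_A \theta_B\, T_A T_B + \cdots + O(\theta^3) \\ &= I + \sum_{A=1}^{d_G} \theta_A T_A + \frac{1}{2!}\, \theta_A \theta_B\, T_A T_B + \cdots + O(\theta^3), \end{aligned} \tag{8.27} $$

where 

$$ T_A = \frac{\partial U}{\partial \theta_A}\Bigg|_{\theta_A = 0}. \tag{8.28} $$

$T_A$ is a matrix generator for the group. 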


8.3.3 Proper group elements 


All infinitesimal group elements can be parametrized in terms of linear combinations 
of generators $T_A$; thus, it is normal for group transformations to be 
discussed in terms of infinitesimal transformations. In terms of the geometrical 
analogy, infinitesimal group elements are those which are very close to the 
identity. They are defined by taking only terms to first order in $\theta$ in the sum 
in eqn. (8.27). The coefficients $\theta_A$ are assumed to be infinitesimally small, so 
that all higher powers are negligible. This is expressed by writing 

$$ U(\delta\theta) = U(0) + \delta\theta^A\, T_A, \tag{8.29} $$


with an implicit summation over A. With infinitesimal transformations, one 
does not get very far from the origin; however, the rule of group composition 
may be used to build (almost) arbitrary elements of the group by repeated 
application of infinitesimal elements. This is analogous to adding up many 
infinitesimal vectors to arrive at any point in a vector space. 

We can check the consistency of repeatedly adding up $N$ group elements by 
writing $\delta\theta^A = \theta^A/N$, combining $U(\theta) = U(\delta\theta)^N$ and letting $N \to \infty$. In this 
limit, we recover the exact result: 

$$ U(\theta) = \lim_{N \to \infty} \left( 1 + i\, \frac{\theta^A}{N}\, T_A \right)^N = e^{i\theta^A T_A}, \tag{8.30} $$

which is consistent with the series in eqn. (8.27), once a factor of $i$ has been 
extracted from each generator, $T_A \to i T_A$, as in eqn. (8.25). Notice that the finite group 
element is the exponential of the infinitesimal combination of the generators. It 
is often stated that we obtain a group by exponentiation of the generators. 
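
The limiting procedure can be followed numerically. The sketch below assumes a single generator $T_1 = \sigma_1/2$ and the closed form $\exp(i\theta T_1) = \cos(\theta/2)\, I + i \sin(\theta/2)\, \sigma_1$ as the exact result; the value of `theta` and the list of `N` values are arbitrary illustrative choices.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def finite_product(theta, N):
    """(1 + i (theta / N) T_1)^N with T_1 = sigma_1 / 2, built by N multiplications."""
    step = [[1, 0.5j * theta / N], [0.5j * theta / N, 1]]
    result = [[1, 0], [0, 1]]
    for _ in range(N):
        result = matmul(result, step)
    return result

def exact(theta):
    """exp(i theta T_1) = cos(theta/2) I + i sin(theta/2) sigma_1."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, 1j * s], [1j * s, c]]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

theta = 1.2
errors = {N: max_diff(finite_product(theta, N), exact(theta))
          for N in (10, 100, 1000, 10000)}
for N, err in errors.items():
    print(N, err)
```

The discrepancy between the $N$-fold product and the exponential shrinks as $N$ grows, in line with the limit in eqn. (8.30).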

It will prove significant to pay attention to another form of this exponentiation 
in passing. Eqn. (8.30) may also be written 

$$ U(\theta) = \exp\left( i \int_0^\theta T_A\, d\theta'^A \right). \tag{8.31} $$

From this we note that 

$$ \frac{\partial U(\theta)}{\partial \theta^A} = i\, T_A\, U(\theta), \tag{8.32} $$

and hence 

$$ \frac{dU}{U} = i\, T_A\, d\theta^A \equiv \Gamma. \tag{8.33} $$

This quantity, which we shall often label $\Gamma$ in future, is an infinitesimal linear 
combination of the generators of the group. Because of the exponential form, it 
can also be written as a differential change in the group element $U(\theta)$ divided 
by the value of $U(\theta)$ at that point. This quantity has a special significance in 
geometry and field theory, and turns up repeatedly in the guise of gauge fields 
and ‘connections’. 

Not all elements of a group can necessarily be generated by combining 
infinitesimal elements of the group. In general, it is only a sub-group known 
as the proper group which can be generated in this way. Some transformations, 
such as reflections in the origin or coordinate reversals with respect to a 
group parameter are, by nature, discrete and discontinuous. A reflection is 
an all-or-nothing transformation; it cannot be broken down into smaller pieces. 
Groups which contain these so-called large transformations are expressible as a 
direct product of a connected, continuous group and a discrete group. 

8.3.4 Conjugate representations 


Given a set of infinitesimal generators, $T_A$, one can generate infinitely many 
more by similarity transformations: 

$$ T_A \to \Lambda\, T_A\, \Lambda^{-1}. \tag{8.34} $$

This has the effect of generating an equivalent representation. Any two 
representations which are related by such a similarity transformation are said 
to be conjugate to one another, or to lie in the same conjugacy class. Conjugate 
representations all have the same dimension $d_R$. 


8.3.5 Congruent representations 


Representations of different dimension $d_R$ also fall into classes. Generators 
which exponentiate to a given group may be classified by congruency class. All 
group generators with different $d_R$ exponentiate to groups which are congruent, 
modulo their centres, i.e. those which are the same up to some multiple covering. 
Put another way, the groups formed by exponentiation of generators of different 
$d_R$ are identical only if one factors out their centres. 

A given matrix representation of a group is not necessarily a one-to-one 
mapping from algebra to group, but might cover all of the elements of a group 
once, twice, or any integer number of times and still satisfy all of the group 
properties. Such representations are said to be multiple coverings. 

A representation $U_R$ and another representation $U_{R'}$ lie in different congruence 
classes if they cover the elements of the group a different number of times. 
Congruence is a property of discrete tiling systems and is related to the ability 
to lay one pattern on top of another such that they match. It is the properties of 
the generators which are responsible for congruence [124]. 


8.4 Reducible and irreducible representations 


There is an infinite number of ways to represent the properties of a given group 
on a representation space. A representation space is usually based on some 
physical criteria; for instance, to represent the symmetry of three quarks, one 
uses a three-dimensional representation of SU (3), although the group itself is 
eight-dimensional. It is important to realize that, if one chooses a large enough 
representation space, the space itself might have more symmetry than the group 
which one is using to describe a particular transformation. Of the infinity 
of possible representations, some can be broken down into simpler structures 
which represent truly invariant properties of the representation space. 

8.4.1 Invariant sub-spaces 


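Suppose we have a representation of a group in terms of matrices and vectors; 
take as an example the two-dimensional rotation group $SO(2)$, with the representation 

$$ U = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \tag{8.35} $$

so that the rotation of a vector by an angle $\theta$ is accomplished by matrix 
multiplication: 

$$ \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \tag{8.36} $$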


It is always possible to find higher-dimensional representations of the same 
group by simply embedding such a group in a larger space. If we add an extra 
dimension $x_3$, then the same rotation is accomplished, since $x_1$ and $x_2$ are altered 
in exactly the same way: 

$$ \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}. \tag{8.37} $$


This makes sense: it is easy to make a two-dimensional rotation in a three-dimensional 
space, and the same generalization carries through for any number 
of extra dimensions. The matrix representation of the transformation has zeros 
and a diagonal 1, indicating that nothing at all happens to the $x_3$ coordinate. It 
is irrelevant or ignorable: 

$$ U = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{8.38} $$


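A six-dimensional representation would look like this: 

$$ \begin{pmatrix} x_1' \\ x_2' \\ x_3' \\ x_4' \\ x_5' \\ x_6' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta & 0 & 0 & 0 & 0 \\ -\sin\theta & \cos\theta & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix}. \tag{8.39} $$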


The matrix has a block-diagonal form. These higher-dimensional representations 
are said to be reducible, since they contain invariant sub-spaces, or 
coordinates which remain unaltered by the group. In the six-dimensional case 
above, the $6 \times 6$ matrix factorizes into a direct sum of block-diagonal pieces: a 
$2 \times 2$ piece, which is the actual $SO(2)$ part, and a trivial four-dimensional group 
composed of only the identity $I_4$. The direct sum is written 

$$ SO(2)_6 = SO(2)_2 \oplus I_4. \tag{8.40} $$


When a matrix has the form of eqn. (8.39), or is related to such a form by a 
similarity transformation 

$$ \Lambda^{-1}\, U\, \Lambda, \tag{8.41} $$

it is said to be a completely reducible representation of the group. In block-diagonal 
form, each block is said to be an irreducible representation of the 
group. The smallest representation with all of the properties of the group 
intact is called the fundamental representation. A representation composed 
of $d_G \times d_G$ matrices, where $d_G$ is the dimension of the group, is called the 
adjoint representation. In the case of $SO(3)$, the fundamental and adjoint 
representations coincide; usually they do not. 

Whilst the above observation might seem rather obvious, it is perhaps less 
obvious if we turn the argument around. Suppose we start with a $6 \times 6$ matrix 
parametrized in terms of some group variables, $\theta_A$, and we want to know which 
group it is a representation of. The first guess might be that it is an irreducible 
representation of $O(6)$, but if we can find a linear transformation $\Lambda$ which 
changes that matrix into a block-diagonal form with smaller blocks, and zeros 
off the diagonal, then it becomes clear that it is really a reducible representation, 
composed of several sub-spaces, each of which is invariant under a smaller 
group. 
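
The embedding of a two-dimensional rotation in a larger space can be sketched in a few lines. The code below assumes the block-diagonal form described above, with the rotation acting on the first two coordinates only; the function names, angles and test vector are illustrative.

```python
import math

def embed(theta, dim=6):
    """dim x dim matrix: a 2x2 rotation block followed by an identity block."""
    c, s = math.cos(theta), math.sin(theta)
    U = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    U[0][0], U[0][1], U[1][0], U[1][1] = c, s, -s, c
    return U

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(U, x):
    return [sum(U[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_new = apply(embed(0.4), x)

# The last four coordinates form an invariant sub-space: they are left untouched.
spectators_unchanged = x_new[2:] == x[2:]

# The embedded matrices still obey the group rule U(a) U(b) = U(a + b).
lhs, rhs = matmul(embed(0.4), embed(0.3)), embed(0.7)
composition_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(6) for j in range(6))
print(x_new, spectators_unchanged, composition_error)
```

Only the first two components of the vector change, which is the numerical counterpart of the direct-sum structure of the matrix.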


8.4.2 Reducibility 


The existence of an invariant sub-space $S$ in the representation space $R$ implies 
that the matrix representation $G_R$ is reducible. Suppose we have a representation 
space with a sub-space which is unaffected by the action of the group. By 
choosing coordinates we can write a group transformation $g$ as 

$$ \begin{pmatrix} X_R' \\ X_S' \end{pmatrix} = \begin{pmatrix} A(g) & B(g) \\ 0 & C(g) \end{pmatrix} \begin{pmatrix} X_R \\ X_S \end{pmatrix}, \tag{8.42} $$

which shows that the coordinates $X_S$ belonging to the sub-space are independent 
of the remaining coordinates $X_R$. Thus no matter how $X_R$ are transformed, $X_S$ 
will be independent of this. The converse is not necessarily true, but often is. 

Our representation, 

$$ U_R(g) = \begin{pmatrix} A(g) & B(g) \\ 0 & C(g) \end{pmatrix}, \tag{8.43} $$

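satisfies the group composition law; thus, 

$$ \begin{aligned} U_R(g_1)\, U_R(g_2) &= \begin{pmatrix} A(g_1) & B(g_1) \\ 0 & C(g_1) \end{pmatrix} \begin{pmatrix} A(g_2) & B(g_2) \\ 0 & C(g_2) \end{pmatrix} \\ &= \begin{pmatrix} A(g_1) A(g_2) & A(g_1) B(g_2) + B(g_1) C(g_2) \\ 0 & C(g_1) C(g_2) \end{pmatrix}. \end{aligned} \tag{8.44} $$

Comparing this with the form which a true group representation would have: 

$$ \begin{pmatrix} A(g_1 \cdot g_2) & B(g_1 \cdot g_2) \\ 0 & C(g_1 \cdot g_2) \end{pmatrix}, \tag{8.45} $$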


one sees that $A$ and $C$ also form representations of the group, of smaller size. 
$B$ does not, however, and its value is constrained by the condition $B(g_1 \cdot g_2) = 
A(g_1) B(g_2) + B(g_1) C(g_2)$. A representation of this form is said to be partially 
reducible. 

If B = 0 in the above, then the two sub-spaces decouple: both are invariant 
under transformations which affect the other. The representation is then said 
to be completely reducible and takes the block-diagonal form mentioned in the 
previous section. 
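
A simple instance of a partially reducible representation is sketched below. It assumes, purely for illustration, the group of maps $x \to s x + a$ written as $2 \times 2$ upper-triangular matrices, so that $A = s$, $B = a$ and $C = 1$; this example and its helper names are not taken from the text.

```python
def affine(s, a):
    """Upper-triangular matrix with blocks A = s, B = a, C = 1."""
    return [[s, a], [0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def A(g): return g[0][0]
def B(g): return g[0][1]
def C(g): return g[1][1]

g1, g2 = affine(2, 3), affine(5, 7)
g12 = matmul(g1, g2)

# The diagonal blocks compose like representations on their own ...
A_is_representation = A(g12) == A(g1) * A(g2)
C_is_representation = C(g12) == C(g1) * C(g2)

# ... whereas B obeys the mixed constraint B(g1.g2) = A(g1) B(g2) + B(g1) C(g2).
B_constraint_holds = B(g12) == A(g1) * B(g2) + B(g1) * C(g2)
B_is_representation = B(g12) == B(g1) * B(g2)
print(g12, A_is_representation, C_is_representation, B_constraint_holds, B_is_representation)
```

The off-diagonal block satisfies the constraint quoted above but does not itself multiply like a representation.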


8.5 Lie groups and Lie algebras 


Groups whose elements do not commute are called non-Abelian. The commutativity 
or non-commutativity of the group elements $U(\theta)$ follows from 
the commutation properties of the generators $T_a$, as may be seen by writing 
the exponentiation operation as a power series. In a non-Abelian group the 
commutation relations between generators may be written in this form: 

$$ [T_a, T_b] = C_{ab}. \tag{8.46} $$

A special class of groups which is interesting in physics is the Lie groups, which 
satisfy the special algebra, 

$$ [T_a, T_b] = -i\, f_{ab}{}^{c}\, T_c. \tag{8.47} $$

$f_{ab}{}^{c}$ is a set of structure constants, and all the labels $a, b, c$ run over the group 
indices from $1, \ldots, d_G$. Eqn. (8.47) is called a Lie algebra. It implies that the 
matrices which generate a Lie group are not arbitrary; they are constrained to 
satisfy the algebra relation. The matrices satisfy the algebraic Jacobi identity 

$$ [T^a, [T^b, T^c]] + [T^b, [T^c, T^a]] + [T^c, [T^a, T^b]] = 0. \tag{8.48} $$


Many of the issues connected to Lie algebras are analogous to those of the 
groups they generate. We study them precisely because they provide a deeper 
level of understanding of groups. One also refers to representations, equivalence 
classes, conjugacy for algebras. 
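
The algebra relation and the Jacobi identity can be verified for a concrete set of generators. The sketch below assumes the $SU(2)$ generators $T_k = \sigma_k/2$ and extracts the structure constants from the sign convention of eqn. (8.47) using ${\rm Tr}(T_c T_d) = \frac{1}{2}\delta_{cd}$, which holds for these particular matrices; the helper names are illustrative.

```python
T = [  # SU(2) generators T_k = sigma_k / 2
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b, sign=1):
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    return add(matmul(a, b), matmul(b, a), sign=-1)

def structure_constant(a, b, c):
    """f_abc defined by [T_a, T_b] = -i f_abc T_c, using Tr(T_c T_d) = delta_cd / 2."""
    m = matmul(commutator(T[a], T[b]), T[c])
    return (2j * (m[0][0] + m[1][1])).real

f = [[[structure_constant(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

def norm(m):
    return max(abs(x) for row in m for x in row)

# Closure: [T_a, T_b] + i f_abc T_c should vanish for every pair (a, b).
closure_residual = max(
    norm([[commutator(T[a], T[b])[i][j] + 1j * sum(f[a][b][c] * T[c][i][j] for c in range(3))
           for j in range(2)] for i in range(2)])
    for a in range(3) for b in range(3))

# Jacobi identity: the cyclic sum of nested commutators should vanish.
jacobi_residual = max(
    norm(add(add(commutator(T[a], commutator(T[b], T[c])),
                 commutator(T[b], commutator(T[c], T[a]))),
             commutator(T[c], commutator(T[a], T[b]))))
    for a in range(3) for b in range(3) for c in range(3))
print(f[0][1][2], closure_residual, jacobi_residual)
```

With this sign convention the structure constants of these generators come out as minus the Levi-Civita symbol, and both residuals vanish to rounding error.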

8.5.1 Normalization of the generators 


The structure of the $d_R \times d_R$ dimensional matrices and the constants $f_{abc}$ which 
make up the algebra relation are determined by the algebra relation, but the 
normalization is not. If we multiply $T_a$ and $f_{abc}$ by any constant factor, the 
algebra relation will still be true. The normalization of the generators is fixed 
here by relating the trace of a product of generators to the quadratic Casimir 
invariant: 

$$ {\rm Tr}\,(T^a T^b) = I_2(G_R)\, \delta^{ab}, \tag{8.49} $$

where $I_2$ is called the Dynkin index for the representation $G_R$. The Dynkin 
index may also be written as 

$$ I_2(G_R) = \frac{d_R}{d_G}\, C_2(G_R), \tag{8.50} $$

where $d_R$ is the dimension (number of rows/columns) of the generators in 
the representation $G_R$, and $d_G$ is the dimension of the group. $C_2(G_R)$ is the 
quadratic Casimir invariant for the group in the representation, $G_R$: $C_2(G_R)$ 
and $I_2(G_R)$ are constants which are listed in tables for various representations 
of Lie groups [96]. $d_G$ is the same as the dimension of the adjoint representation 
of the algebra $G_{\rm adj}$, by definition of the adjoint representation. Note, therefore, 
that $I_2(G_{\rm adj}) = C_2(G_{\rm adj})$. 

The normalization is not completely fixed by these conditions, since one 
does not know the value of the Casimir invariant a priori. Moreover, Casimir 
invariants are often defined with inconsistent normalizations, since their main 
property of interest is their ability to commute with other generators, rather 
than their absolute magnitude. The above relations make the Casimir invariants 
consistent with the generator products. To complete the normalization, it is usual 
to define the length of the longest roots or eigenvalues of the Lie algebra as 2. 
This fixes the value of the Casimir invariants and thus fixes the remaining values. 
For most purposes, the normalization is not very important as long as one is 
consistent, and most answers can simply be expressed in terms of the arbitrary 
value of C2(Gr). Thus, during the course of an analysis, one should not be 
surprised to find generators and Casimir invariants changing in definition and 
normalization several times. What is important is that, when comparisons are 
made between similar things, one uses consistent conventions of normalization 
and definition. 


8.5.2 Adjoint transformations and unitarity 


A Lie algebra is formed from the $d_G$ matrices $T^a$ which generate a Lie group. 
These matrices are $d_R \times d_R$ matrices which act on the vector space, which 
has been denoted representation space. In addition, the $d_G$ generators which 
fulfil the algebra condition form a basis which spans the group space. Since 
the group is formed from the algebra by exponentiation, both a Lie algebra $A$ 
and its group $G$ live on the vector space referred to as group space. In the case 
of the adjoint representation $G_R = G_{\rm adj}$, the group and representation spaces 
coincide ($d_G = d_R$, $a, b, c \leftrightarrow A, B, C$). The adjoint representation is a direct 
one-to-one mapping of the algebra properties into a set of matrices. It is easy to 
show that the structure constants themselves form a representation of the group 
which is adjoint. This follows from the Jacobi identity in eqn. (8.48). Applying 
the algebra relation (8.47) to eqn. (8.48), we have 

$$ [T^a, -i f^{bcd}\, T^d] + [T^b, -i f^{cad}\, T^d] + [T^c, -i f^{abd}\, T^d] = 0. \tag{8.51} $$


Using it again results in 

$$ \left( -f^{bcd} f^{ade} - f^{cad} f^{bde} - f^{abd} f^{cde} \right) T^e = 0. \tag{8.52} $$

Then, from the coefficient of $T^e$, making the identification, 

$$ [T^a]^{bc} \equiv i f^{abc}, \tag{8.53} $$

it is straightforward to show that one recovers 

$$ [T^a, T^b] = -i f^{abd}\, T^d. \tag{8.54} $$

Thus, the components of the structure constants are the components of the 
matrices in the adjoint representation of the algebra. The representation is 
uniquely identified as the adjoint since all indices on the structure constants 
run over the dimension of the group $a, b = 1, \ldots, d_G$. 
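
The statement that the structure constants themselves furnish a representation can be checked directly. The sketch below assumes the $SU(2)$ structure constants $f^{abc} = -\epsilon^{abc}$, which is what the sign convention of eqn. (8.47) gives for generators $\sigma_k/2$; the helper names are illustrative.

```python
def eps(a, b, c):
    """Levi-Civita symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def f(a, b, c):
    """Assumed SU(2) structure constants in the convention [T_a, T_b] = -i f_abc T_c."""
    return -eps(a, b, c)

# Adjoint generators built from the structure constants: (T^a)^{bc} = i f^{abc}.
T_adj = [[[1j * f(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(3)] for i in range(3)]

# Check [T^a, T^b] = -i f^{abd} T^d entry by entry.
residual = 0.0
for a in range(3):
    for b in range(3):
        lhs = commutator(T_adj[a], T_adj[b])
        for i in range(3):
            for j in range(3):
                rhs = -1j * sum(f(a, b, d) * T_adj[d][i][j] for d in range(3))
                residual = max(residual, abs(lhs[i][j] - rhs))

# The adjoint generators are Hermitian because the structure constants are real
# and anti-symmetric.
hermitian = all(T_adj[a][i][j] == T_adj[a][j][i].conjugate()
                for a in range(3) for i in range(3) for j in range(3))
print(residual, hermitian)
```

The $3 \times 3$ matrices built from the structure constants satisfy the same algebra as the $2 \times 2$ generators, which is what identifies them as the adjoint representation.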

The group space to which we have been alluding is assumed, in field 
theory, to be a Hilbert space, or a vector space with a positive definite metric. 
Representation space does not require a positive definite metric, and indeed, in 
the case of groups like the Lorentz group of spacetime symmetries, the metric 
in representation space is indefinite. The link between representation space and 
group space is made by the adjoint representation, and it will prove essential 
later to understand what this connection is. 

Adjoint transformations can be understood in several ways. Suppose we take 
a group vector $v^a$ which transforms by the rule 

$$ v'^a = (U_{\rm adj})^a{}_b\, v^b, \tag{8.55} $$

where 

$$ U_{\rm adj} = \exp\left( i \theta^a\, T^a_{\rm adj} \right). \tag{8.56} $$

It is also possible to represent the same transformation using a complete set of 
arbitrary matrices to form a basis for the group space. For the matrices we shall 
choose the generators $T_R$, in an arbitrary representation 

$$ V_R = v^a\, T^a_R. \tag{8.57} $$

If we assume that the $v^a$ in eqns. (8.55) and (8.57) are the same components, 
then it follows that the transformation rule for $V_R$ must be written 

$$ V_R' = v'^a\, T^a_R = U_R^{-1}\, V_R\, U_R, \tag{8.58} $$

where 

$$ U_R = \exp\left( i \theta^a\, T^a_R \right). \tag{8.59} $$


This now has the appearance of a similarity transformation on the group space. 
To prove this, we shall begin with the assumption that the field transforms as in 
eqn. (8.58). Then, using the matrix identity 

$$ \exp(A)\, B\, \exp(-A) = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \frac{1}{3!}[A, [A, [A, B]]] + \cdots, \tag{8.60} $$

it is straightforward to show that 

$$ V_R' = v^a \left( \delta^{ar} - \theta^b f^{bar} + \frac{1}{2!}\, \theta^b \theta^c f^{bad} f^{cdr} - \frac{1}{3!}\, \theta^b \theta^c \theta^d f^{bae} f^{ceg} f^{dgr} + \cdots \right) T^r_R, \tag{8.61} $$

where the algebra commutation relation has been used. In our notation, the 
generators of the adjoint representation may be written 

$$ (T^a_{\rm adj})^{bc} = i f^{abc}, \tag{8.62} $$

and the structure constants are real. Eqn. (8.61) may therefore be identified as 

$$ V_R' = v^a\, (U_{\rm adj})^a{}_b\, T^b_R, \tag{8.63} $$

where 

$$ U_{\rm adj} = \exp\left( i \theta^a\, T^a_{\rm adj} \right). \tag{8.64} $$

If we now define the components of the transformed field by 

$$ V_R' = v'^a\, T^a_R, \tag{8.65} $$

in terms of the original generators, then it follows that 

$$ v'^a = (U_{\rm adj})^a{}_b\, v^b. \tag{8.66} $$


We can now think of the set of components, $v^a$ and $v'^a$, as being grouped into 
$d_G$-component column vectors $v$ and $v'$, so that 

$$ v' = U_{\rm adj}\, v. \tag{8.67} $$

Thus, we see that the components of a group vector, $v^a$, always transform 
according to the adjoint representation, regardless of what type of basis we 
use to represent them. To understand the significance of this transformation 
rule, we should compare it with the corresponding tensor transformation rule in 
representation space. If we use the matrix 

$$ U_R = [U_R]^A{}_B, \tag{8.68} $$

where $A, B = 1, \ldots, d_R$, as a transformation of some representation space 
vector $\phi^A$ or tensor $[V_R]^A{}_B$, then, by considering the invariant product 

$$ \phi^\dagger\, V_R\, \phi \to (U\phi)^\dagger\; U V_R U^{-1}\; (U\phi), \tag{8.69} $$

we find that the transformation rule is the usual one for tensors: 

$$ \phi^A \to U^A{}_B\, \phi^B \tag{8.70a} $$
$$ V_{AB} \to U_A{}^C\, U_B{}^D\, V_{CD}. \tag{8.70b} $$

The transformation rule (8.58) agrees with the rule in eqn. (8.70b) provided 

$$ U^\dagger = U^{-1}. \tag{8.71} $$


This is the unitary property, and it is secured in field theory also by the use 
of a Hilbert space as the group manifold. Thus, the form of the adjoint 
transformation represents unitarity in the field theory, regardless of the fact that 
the indices A, B might have an indefinite metric. 
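
The relation between the similarity transformation in eqn. (8.58) and the adjoint matrix in eqn. (8.63) can be tested numerically. The sketch below assumes $SU(2)$ with generators $\sigma_k/2$, structure constants $f^{abc} = -\epsilon^{abc}$ in the convention of eqn. (8.47), and a truncated power series for the matrix exponential; the parameter values and helper names are illustrative.

```python
T = [  # SU(2) generators T_k = sigma_k / 2
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def eps(a, b, c):
    return (a - b) * (b - c) * (c - a) // 2

def f(a, b, c):
    return -eps(a, b, c)   # assumed convention: [T_a, T_b] = -i f_abc T_c

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential summed from its power series."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

theta = [0.3, -0.7, 0.5]   # arbitrary illustrative group parameters

# U_R = exp(i theta^a T^a) and its inverse, in the two-dimensional representation.
X = [[1j * sum(theta[b] * T[b][i][j] for b in range(3)) for j in range(2)] for i in range(2)]
U = expm(X)
U_inv = expm([[-x for x in row] for row in X])

# U_adj = exp(i theta^b T_adj^b), with (T_adj^b)^{ar} = i f^{bar}.
M = [[-sum(theta[b] * f(b, a, r) for b in range(3)) for r in range(3)] for a in range(3)]
U_adj = expm(M)

# Compare U^{-1} T^a U with the sum over r of (U_adj)^{ar} T^r for each generator.
residual = 0.0
for a in range(3):
    lhs = matmul(matmul(U_inv, T[a]), U)
    for i in range(2):
        for j in range(2):
            rhs = sum(U_adj[a][r] * T[r][i][j] for r in range(3))
            residual = max(residual, abs(lhs[i][j] - rhs))
print(residual)
```

The residual vanishes to rounding error, so the similarity transformation of each generator is a linear combination of generators with coefficients taken from the adjoint matrix.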

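The object $V_R$, which transforms like $U^{-1} V U$, signifies a change in the 
disposition of the system. This form is very commonly seen; for example, in 
dynamical changes: 

$$ \begin{aligned} \partial_\mu \phi \to \partial_\mu (U\phi) &= (\partial_\mu U)\phi + U(\partial_\mu \phi) \\ &= U(\partial_\mu + \Gamma_\mu)\phi, \end{aligned} \tag{8.72} $$

where 

$$ \Gamma_\mu = U^{-1}\, \partial_\mu U. \tag{8.73} $$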


This object is usually called a ‘connection’, but, in this context, it can be viewed 
as an expression of a change in the dynamical configuration, of the internal 
constraints on a system. In the following two chapters, we shall see examples of 
these transformations, when looking at the Lorentz group and gauge symmetries 
in particular. 

8.5.3 Casimir invariants 


From the Lie algebra relation in eqn. (8.47), it is straightforward to show that 
the quadratic sum of the generators commutes with each individual generator: 


Ie cS T’T’] = Ri T’T’? — T’T’ T° 
=T" TPT oa: T° (TIT? + if™ T°) 
= IT", T?] T° = pee’ TTS 
=-j fave (eer? 4 or | 
= 0. (8.74) 
The last line follows since the bracket is symmetric in $a$ and $c$, whereas the 
structure constants are anti-symmetric. In fact, the quadratic sum of the 
generators is proportional to the identity matrix. This follows also from Schur’s 
lemma: 


    $T^a T^a = \frac{1}{d_G}\, C_2(G_R)\, I_R$,    (8.75) 
or 


    $f^{abc} f^{dbc} = \frac{1}{d_G}\, C_2(G_{\rm adj})\, \delta^{ad}$.    (8.76) 


8.5.4 Sub-algebra 


Just as groups have sub-groups, algebras have sub-algebras. A sub-set, H, of an 
algebra, A, is called a linear sub-algebra of A if H is a linear sub-space of the 
group space and is closed with respect to the algebra relation, i.e. for any matrix 
elements of the sub-algebra $h_1$, $h_2$ and $h_3$, one has 


    $[h_1, h_2] = -i f_{12}{}^{3}\, h_3$.    (8.77) 
This is a non-Abelian sub-algebra. Sub-algebras can also be Abelian: 


    $[h_1, h_2] = 0$.    (8.78) 


8.5.5 The Cartan sub-algebra 


The Cartan sub-algebra is an invariant sub-algebra whose elements generate the 
centre of a Lie group when exponentiated. This sub-algebra has a number of 
extremely important properties because many properties of the group can be 
deduced directly from the sub-set of generators which lies in the Cartan sub- 
algebra. 
The generators of the Cartan sub-algebra commute with one another but not 
necessarily with other generators of the group. Since the Cartan sub-algebra 
generates the centre of the group (the maximal Abelian sub-group) under 
exponentiation, Schur’s lemma tells us that the group elements found from these 
are diagonal and proportional to the identity matrix. The Cartan sub-algebra is 
the sub-set of the group generators $T^a$ which are simultaneously diagonalizable in 
a suitable basis. In other words, if there is a basis in which one of the generators, 
$T^a$, is diagonal, then, in general, several of the generators will be diagonal in the 
same basis. One can begin with a set of generators, $T^a_R$, in a representation, $G_R$, 
and attempt to diagonalize one of them using a similarity transformation: 


    $T^a_R \rightarrow \Lambda^{-1}\, T^a_R\, \Lambda$.    (8.79) 


The same transformation, $\Lambda$, will transform a fixed number of the matrices into 
diagonal form. This number is always the same, and it is called the rank of 
the group or rank(G). The diagonalizable generators are denoted $H^i$, where 
$i = 1, \ldots, {\rm rank}(G)$. These form the Cartan sub-algebra. Note that, in the case 
of the fundamental representation of SU(2), the third Pauli matrix is already 
diagonal. This matrix is the generator of the Cartan sub-algebra for SU(2) in 
the $d_R = 2$ representation. Since only one of the generators is diagonal, one 
concludes that the rank of SU(2) is 1. 


8.5.6 Example of diagonalization 


The simplest example of a Cartan sub-algebra may be found in the generators 
of the group SO (3) in the fundamental representation, or identically of SU (2) 
in the adjoint representation. These matrices are well known as the generators 
of rotations in three dimensions, and are written: 


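    $T^1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}$ 
    $T^2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}$ 
    $T^3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.    (8.80) 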


To find a basis which diagonalizes one of these generators, we pick $T^1$ to 
diagonalize, arbitrarily. The self-inverse matrix of eigenvectors for $T^1$ is easily 
found. It is given by 


    $\Lambda = \begin{pmatrix} -1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{-i}{\sqrt{2}} \\ 0 & \frac{i}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix}$.    (8.81) 


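Constructing the matrices $\Lambda^{-1} T^a \Lambda$, one finds a new set of generators, 


    $T^1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$ 
    $T^2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & i \\ 1 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}$ 
    $T^3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & i & 1 \\ -i & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}$.    (8.82) 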


Since only one of these is diagonal, rank SU(2) = 1. Equally, we could 
have chosen to diagonalize a different generator. This would then have had 
the same eigenvalues, and it would have been the generator of the Cartan sub- 
algebra in an alternative basis. None of the generators are specially singled out 
to generate the sub-algebra. The diagonalizability is an intrinsic property of the 
algebra. 
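
The diagonalization above can be cross-checked numerically. The following is a minimal sketch in Python with NumPy (our choice of tool, not part of the original text); it assumes the generators exactly as listed in eqn. (8.80) and the matrix of eqn. (8.81), and the helper name `is_diagonal` is purely illustrative.

```python
import numpy as np

# Generators of SO(3) as listed in eqn. (8.80).
T1 = np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
T2 = np.array([[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]])
T3 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])

# Matrix of eigenvectors of T1 from eqn. (8.81).
s = 1 / np.sqrt(2)
Lam = np.array([[-1, 0, 0], [0, s, -1j * s], [0, 1j * s, -s]])


def is_diagonal(m, tol=1e-12):
    """Return True if every off-diagonal entry of m vanishes within tol."""
    return np.allclose(m - np.diag(np.diag(m)), 0, atol=tol)


# Similarity transformation of eqn. (8.79) applied to each generator.
rotated = [np.linalg.inv(Lam) @ T @ Lam for T in (T1, T2, T3)]

# Number of generators that are diagonal in the new basis: the rank.
n_diagonal = sum(is_diagonal(m) for m in rotated)
print("diagonal generators in the new basis:", n_diagonal)
```

Only the transformed $T^1$ comes out diagonal, with entries 0, 1 and -1 on the diagonal, which is the statement that the rank is 1.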


8.5.7 Roots and weights 


The roots and weights of algebra representations are proportional to eigenvalues 
of the Cartan sub-algebra generators for different $d_R$. The roots are denoted $\alpha_A$ 
and the weights are denoted $\Lambda_A$. Because the algebra relation ensures exactly $d_G$ 
independent vectors on the group space, there are $d_G$ independent eigenvalues 
to be found from the generators.¹ We shall explore the significance of these 
eigenvalues in the next section. 


¹ This might seem confusing. If one has rank(G) simultaneously diagonalizable $d_R \times d_R$ 
matrices, then it seems as though there should be $d_R \times {\rm rank}(G)$ eigenvalues to discern. The 
reason why this is not the case is that not all of the generators are independent. They are 
constrained by the algebra relation. The generators are linearly independent but constrained 
through the quadratic commutator condition. 
For generators of the Cartan sub-algebra, $H^i_R$, in a representation $G_R$, the 
weights are eigenvalues: 


Hi = ri (8.83) 


ia = | aj | ; (8.84) 


The significance of the adjoint representation is that it is a direct one-to-one 
mapping of intrinsic algebra properties. The roots have a special significance 
too: the algebra can be defined purely in terms of its roots. The diagonal basis 
we have referred to above is a step towards showing this, but to see the true 
significance of the roots and weights of an algebra, we need to perform another 
linear transformation and construct the Cartan-Weyl basis. 


8.5.8 The Cartan—Weyl basis 


The Cartan-Weyl basis is one of several bases in which the generators of 
the Cartan sub-algebra are diagonal matrices. To construct this basis we can 
begin from the diagonal basis, found in the previous section, and form linear 
combinations of the remaining non-diagonal generators. The motivation for this 
requires a brief theoretical diversion. 

Suppose that $\Theta$ and $\Phi$ are arbitrary linear combinations of the generators of a 
Lie algebra. This would be the case if $\Theta$ and $\Phi$ were non-Abelian gauge fields, 
for instance 


    $\Theta = \theta_a T^a$ 
    $\Phi = \phi_a T^a$,    (8.85) 
where $a = 1, \ldots, d_G$. Then, consider the commutator eigenvalue equation 
    $[\Theta, \Phi] = \alpha\, \Phi$,    (8.86) 


where $\alpha$ is an eigenvalue for the ‘eigenvector’ $\Phi$. If we write this in component 
form, using the algebra relation in eqn. (8.47), we have 


    $\theta_a \phi_b\, f^{ab}{}_c\, T^c = \alpha\, \phi_c T^c$.    (8.87) 
Now, since the $T^a$ are linearly independent we can compare the coefficients of 
the generators on the left and right hand sides: 


    $(\theta_a f^{ab}{}_c - \alpha\, \delta^b{}_c)\, \phi_b = 0$.    (8.88) 


This equation has non-trivial solutions if the determinant of the bracket vanishes, 
and thus we require 


    $\det \left| \theta_a f^{ab}{}_c - \alpha\, \delta^b{}_c \right| = 0$.    (8.89) 


For a $d_G$ dimensional Lie algebra this equation cannot have more than $d_G$ 
independent roots, $\alpha$. Cartan showed that if one chooses $\Theta$ so that the secular 
equation has the maximum number of different eigenvalues or roots, then only 
zero roots $\alpha = 0$ can be degenerate (repeated). If $\alpha = 0$ is r-fold degenerate, 
then r is the rank of the semi-simple Lie algebra. 
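
To make the secular equation concrete, here is a small numerical sketch for the rotation algebra. It is written in Python with NumPy (our choice) and rests on two assumptions that are not fixed by the text here: the structure constants are taken to be the Levi-Civita symbol, and the commutator is taken as $[T^a, T^b] = i\epsilon^{abc} T^c$, which is what the explicit matrices of eqn. (8.80) satisfy. The coefficients in `theta` are arbitrary illustrative numbers.

```python
import numpy as np

# Levi-Civita symbol, used here as the structure constants (an assumption).
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c] = 1.0   # even permutations
    eps[a, c, b] = -1.0  # odd permutations

# Arbitrary coefficients theta_a of the element Theta = theta_a T^a.
theta = np.array([0.3, -1.2, 0.5])

# Matrix of the secular equation: [Theta, T^b] = M_cb T^c with
# M_cb = i theta_a eps_abc. It is Hermitian, so its roots are real.
M = 1j * np.einsum("a,abc->cb", theta, eps)
roots = np.linalg.eigvalsh(M)  # returned in ascending order

# The multiplicity of the zero root is the rank of the algebra.
n_zero = int(np.sum(np.isclose(roots, 0.0, atol=1e-12)))
print("roots:", roots, "number of zero roots:", n_zero)
```

For a generic choice of `theta` the roots are 0 and plus or minus the length of the vector `theta`; the single zero root reproduces rank 1, and the non-zero roots come as a pair of opposite sign.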

The generators associated with zero eigenvalues are denoted $H^i$, where $i = 
1, \ldots, {\rm rank}(G)$ and they satisfy 


    $[\theta_j H^j, H^i] = 0$,    (8.90) 


i.e. they commute with one another. The remaining generators, with which they do 
not commute, are written $E_\alpha$, for some non-zero $\alpha$, and they clearly satisfy 


    $[\theta_j H^j, E_\alpha] = \alpha\, E_\alpha$.    (8.91) 


We can think of the roots or eigenvalues as vectors living on the invariant sub- 
space spanned by the generators $H^i$. The components can be found by allowing 
the $H^i$ to act on the $E_\alpha$. Consider 


    $[\theta_j H^j, [H^i, E_\alpha]] = [\theta_j H^j, H^i E_\alpha] - [\theta_j H^j, E_\alpha H^i]$ 
    $\qquad = \alpha\, [H^i, E_\alpha]$.    (8.92) 


This result can be interpreted as follows. If $E_\alpha$ is an ‘eigenvector’ associated 
with the eigenvalue $\alpha$, then there are rank(G) eigenvectors $[H^i, E_\alpha]$ belonging 
to the same eigenvalue. The eigenvectors must therefore each be proportional to 
$E_\alpha$: 


    $[H^i, E_\alpha] = \alpha^i\, E_\alpha$,    (8.93) 
and the components of the vector are defined by 
    $\alpha = \alpha^i\, \theta_i$.    (8.94) 


This relation defines the components of a root vector on the invariant Cartan 
sub-space. Comparing eqn. (8.93) with the algebra relation in eqn. (8.47), 


    $f^{i}{}_{\alpha}{}^{\beta} = \alpha^i\, \delta_{\alpha}{}^{\beta}$.    (8.95) 
Finally, by looking at the Jacobi identity, 
    $[\Theta, [E_\alpha, E_\beta]] + [E_\alpha, [E_\beta, \Theta]] + [E_\beta, [\Theta, E_\alpha]] = 0$,    (8.96) 
we find that 
    $[\Theta, [E_\alpha, E_\beta]] = (\alpha + \beta)\, [E_\alpha, E_\beta]$.    (8.97) 


This means that $[E_\alpha, E_\beta]$ is the eigenvector associated with the root $(\alpha + \beta)$, 
provided that $\alpha + \beta \neq 0$. If $\alpha + \beta = 0$ then, since the zero eigenvalues are 
associated with $H^i$, we must have 


    $[E_\alpha, E_{-\alpha}] = f_{\alpha, -\alpha}{}^{i}\, H_i$ 
    $\qquad = \alpha_i H^i$.    (8.98) 
This shows how the $E_\alpha$ act as stepping operators, adding together solutions to 
the eigenvalue equation. It also implies that if there is a zero root, then there 
must be pairs of roots $\alpha$, $-\alpha$. In summary, 


    $[H^i, E_\alpha] = \alpha^i\, E_\alpha$ 
    $[E_\alpha, E_{-\alpha}] = \alpha_i H^i$. 

What is the physical meaning of the root vectors? The eigenvalue equation is 
an equation which tells us how many ways one generator of transformations 
maps to itself, up to a scalar multiple under the action of the group. The 
$H^i$ are invariant sub-spaces of a symmetry group because they only change 
the magnitude of a symmetry state, not its character. In other words, the 
Cartan sub-algebra represents the number of simultaneous labels which can be 
measured or associated with a symmetry constraint. Labels represent physical 
properties like spin, momentum, energy, etc. The stepping operators for a given 
representation of the group determine how many independent values of those 
labels can exist based on symmetry constraints. This is the number of weights in 
a stepping chain. In the case of rotations, the root/weight eigenvalues represent 
the spin characteristics of particles. A system with one pair of weights (one 
property: rotation about a fixed axis) in a $d_R = 2$ representation can only be in 
a spin up or spin down state because there are only two elements in the stepping 
chain. A $d_R = 3$ representation has three elements, so the particle can have spin 
up, down or zero, etc. 

The Chevalley normalization of generators is generally chosen so as to make 
the magnitude of the longest root vectors equal to $(\alpha, \alpha) = \alpha^a \alpha_a = 2$. 
8.5.9 Group vectors and Dirac notation 


In quantum mechanics, Dirac introduced a notation for the eigenvectors of an 
operator using bra and ket notation. Dirac’s notation was meant to emphasize 
the role of eigenvectors as projection operators which span a vector space. 
Dirac’s notation is convenient since it is fairly intuitive and is widely used in 
the physics literature. An eigenvector is characterized by a number of labels, 
i.e. the eigenvalues of the various operators which share it as an eigenvector. 


If we label these eigenvalues $\alpha, \beta, \ldots$ and so on, then we may designate the 
eigenvectors using a field or eigenfunction 
    $\psi_{\alpha_i, \beta_j, \ldots}$    (8.99) 


or in Dirac notation as a ket: 
    $|\alpha_i, \beta_j, \ldots\rangle$.    (8.100) 


Notice that, in Dirac’s notation, the redundant symbol $\psi$ is removed, which 
helps to focus one’s attention on the relevant labels: the eigenvalues themselves. 
The operators which have these eigenfunctions as simultaneous eigenvectors 
then produce: 


    $A_i\, \psi_{\alpha_i, \beta_j, \ldots} = \alpha_i\, \psi_{\alpha_i, \beta_j, \ldots}$ 
    $B_j\, \psi_{\alpha_i, \beta_j, \ldots} = \beta_j\, \psi_{\alpha_i, \beta_j, \ldots}$    (i, j not summed),    (8.101) 
or, equivalently, 
    $A_i\, |\alpha_i, \beta_j, \ldots\rangle = \alpha_i\, |\alpha_i, \beta_j, \ldots\rangle$ 
    $B_j\, |\alpha_i, \beta_j, \ldots\rangle = \beta_j\, |\alpha_i, \beta_j, \ldots\rangle$    (i, j not summed).    (8.102) 
In most physical problems we are interested in group spaces with a positive 
definite metric, i.e. Hilbert spaces. In that case, the dual vectors are written as a 
Hermitian conjugate: 


    $\psi^\dagger_{\alpha_i, \beta_j, \ldots}$    (8.103) 
or in Dirac notation as a bra: 
    $\langle \alpha, \beta, \ldots |$.    (8.104) 
The length of a vector is then given by the inner product 
    $\langle \alpha_i, \beta_j | \alpha_k, \beta_l \rangle = \psi^\dagger_{\alpha_i, \beta_j}\, \psi_{\alpha_k, \beta_l} = \delta_{ik}\, \delta_{jl} \times {\rm length}$.    (8.105) 


The eigenvectors with different eigenvalues are orthogonal and usually normal- 
ized to unit length. 

The existence of simultaneous eigenvalues depends on the existence of 
commuting operators. Operators which do not commute, such as $x^i$, $p^j$ and 
group generators, $T^a$, $T^b$, can be assigned eigenvectors, but they are not all 
linearly independent; they have a projection which is a particular group element: 


    $\langle x | p \rangle = e^{i p x/\hbar}$.    (8.106) 
8.5.10 Example: rotational eigenvalues in three dimensions 


In this section, we take a first look at the rotation problem. We shall return 
to this problem in chapter 11 in connection with angular momentum and 
spin. The generators of three-dimensional rotations are those of SO(3), or 
equivalently su(2) in the adjoint representation. The generators are already 
listed in eqns. (8.80). We define 


    $T^2 = T^a T^a$ 
    $E_\pm = T_2 \pm i T_3$ 
    $H = T_1$.    (8.107) 


In this new basis, the generators satisfy the relation 


    $[H, E_\pm] = \pm E_\pm$.    (8.108) 


The stepping operators are Hermitian conjugates: 
    $E_+^\dagger = E_-$.    (8.109) 


The generator H labels a central generator, or invariant sub-space, and 
corresponds to the fact that we are considering a special axis of rotation. The 
eigenvalues of the central generator H are called its weights and are labelled 
$\Lambda_c$: 


    $H\, |\Lambda_c\rangle = \Lambda_c\, |\Lambda_c\rangle$.    (8.110) 


$|\Lambda_c\rangle$ is an eigenvector of H with eigenvalue $\Lambda_c$. The value of the quadratic 
form, $T^2$, is also interesting because it commutes with H and therefore has its 
own eigenvalue when acting on H’s eigenfunctions, which is independent of c. 
It can be evaluated by expressing $T^2$ in terms of the generators in the new basis: 


    $E_+ E_- = T_2^2 + T_3^2 - i[T_2, T_3]$ 
    $E_- E_+ = T_2^2 + T_3^2 + i[T_2, T_3]$,    (8.111) 


so that, rearranging and using the algebra relation $[T_2, T_3] = i T_1$ satisfied by 
the matrices in eqns. (8.80), 


    $T^2 = E_- E_+ + T_1^2 - i[T_2, T_3]$ 
    $\qquad = E_- E_+ + T_1^2 - i(i T_1)$ 
    $\qquad = E_- E_+ + H(H + 1)$,    (8.112) 


where we have identified $T_1 = H$ in the last line. By the analogous procedure 
with $\pm$ labels reversed, we also find 


    $T^2 = E_+ E_- + H(H - 1)$.    (8.113) 
These forms allow us to evaluate the eigenvalues of $T^2$ for two of the 
eigenfunctions in the full series. To understand this, we note that the effect of the $E_\pm$ 
generators is to generate new solutions step-wise, i.e. starting with an arbitrary 
eigenfunction $|\Lambda_c\rangle$ they generate new eigenfunctions with new eigenvalues. 
This is easily confirmed from the commutation relation in eqn. (8.108), if 
we consider the ‘new’ eigenvector $E_\pm|\Lambda_c\rangle$ from $|\Lambda_c\rangle$ and try to calculate the 
corresponding eigenvalue: 


    $H E_\pm\, |\Lambda_c\rangle = (E_\pm H + [H, E_\pm])\, |\Lambda_c\rangle$ 
    $\qquad = (E_\pm H \pm E_\pm)\, |\Lambda_c\rangle$ 
    $\qquad = (\Lambda_c \pm 1)\, E_\pm |\Lambda_c\rangle$.    (8.114) 


We see that, given any initial eigenfunction of H, the action of $E_\pm$ is to produce 
a new eigenfunction with a new eigenvalue, which differs by $\pm 1$ from the 
original, up to a possible normalization constant which would cancel out of this 
expression: 


    $E_\pm\, |\Lambda_c\rangle \propto |\Lambda_c \pm 1\rangle$.    (8.115) 


Now, the number of solutions cannot be infinite because the Schwarz (triangle) 
inequality tells us that the eigenvalue of $T^2$ (whose value is not fixed by the 
eigenvalue of H, since $T^2$ and $T^a$ commute) must be bigger than any of the 
individual eigenvalues of $(T^a)^2$: 


    $\langle \Lambda_c | E_+ E_- + E_- E_+ + H^2 | \Lambda_c \rangle > \langle \Lambda_c | H^2 | \Lambda_c \rangle$,    (8.116) 


so the value of H acting on $|\Lambda_c\rangle$ must approach a maximum as it approaches 
the value of $T^2$ acting on $|\Lambda_c\rangle$. Physically, the maximum value occurs when 
all of the rotation is about the a = 1 axis corresponding to our chosen Cartan 
sub-algebra generator, $T_1 = H$. 

In other words, there is a highest value, $\Lambda_{\rm max}$, and a lowest eigenvalue, $\Lambda_{\rm min}$. 
Now eqns. (8.112) and (8.113) are written in such a way that the first terms 
contain $E_\pm$, ready to act on any eigenfunction, so, since there is a highest and 
lowest eigenvalue, we must have 


    $E_+\, |\Lambda_{\rm max}\rangle = 0$ 
    $E_-\, |\Lambda_{\rm min}\rangle = 0$.    (8.117) 
Thus, 
    $T^2\, |\Lambda_{\rm max}\rangle = \Lambda_{\rm max}(\Lambda_{\rm max} + 1)\, |\Lambda_{\rm max}\rangle$,    (8.118) 
and 


    $T^2\, |\Lambda_{\rm min}\rangle = \Lambda_{\rm min}(\Lambda_{\rm min} - 1)\, |\Lambda_{\rm min}\rangle$.    (8.119) 




From these two points of reference, we deduce that 
    $\Lambda_{\rm max}(\Lambda_{\rm max} + 1) = \Lambda_{\rm min}(\Lambda_{\rm min} - 1)$.    (8.120) 


This equation has two solutions, $\Lambda_{\rm min} = \Lambda_{\rm max} + 1$ (which cannot exist, since 
there is no solution higher than $\Lambda_{\rm max}$ by assumption), and 


    $\Lambda_{\rm max} = -\Lambda_{\rm min}$,    (8.121) 
thus 
    $T^2 = \Lambda_{\rm max}(\Lambda_{\rm max} + 1)\, I$.    (8.122) 


The result means that the value $T^2$ is fixed by the maximum value which H can 
acquire. Strangely, the value is not $\Lambda_{\rm max}^2$ (all rotation about the 1 axis), which 
one would expect from the behaviour of the rotation group. This has important 
implications for quantum mechanics, since it is the algebra which is important 
for angular momentum or spin. It means that the total angular momentum can 
never be all in one fixed direction. As $\Lambda_{\rm max} \rightarrow \infty$ the difference becomes 
negligible. 

The constant of proportionality in eqn. (8.115) can now be determined from 
the Hermitian property of the stepping operators as follows. The squared norm 
of $E_+|\Lambda_c\rangle$ may be written using eqn. (8.112) 


    $|E_+ |\Lambda_c\rangle|^2 = \langle \Lambda_c | E_- E_+ | \Lambda_c \rangle$ 
    $\qquad = \langle \Lambda_c | T^2 - H(H + 1) | \Lambda_c \rangle$ 
    $\qquad = \Lambda_{\rm max}(\Lambda_{\rm max} + 1) - \Lambda_c(\Lambda_c + 1)$ 
    $\qquad = (\Lambda_{\rm max} - \Lambda_c)(\Lambda_{\rm max} + \Lambda_c + 1)$.    (8.123) 


Thus, 


    $E_+ |\Lambda_c\rangle = \sqrt{(\Lambda_{\rm max} - \Lambda_c)(\Lambda_{\rm max} + \Lambda_c + 1)}\; |\Lambda_c + 1\rangle$ 
    $E_- |\Lambda_c\rangle = \sqrt{(\Lambda_{\rm max} + \Lambda_c)(\Lambda_{\rm max} - \Lambda_c + 1)}\; |\Lambda_c - 1\rangle$.    (8.124) 


Eqn. (8.121), taken together with eqn. (8.114), implies that the eigenvalues are 
distributed symmetrically about $\Lambda_c = 0$ and that they are separated by integer 
steps. This means that the possible values are restricted to 


    $\Lambda_c = 0, \pm\frac{1}{2}, \pm 1, \pm\frac{3}{2}, \pm 2, \ldots, \pm\Lambda_{\rm max}$.    (8.125) 


There are clearly $2\Lambda_{\rm max} + 1$ possible solutions. In the study of angular 
momentum, $\Lambda_{\rm max}$ is called the spin up to dimensional factors ($\hbar$). In group 
theory, this is referred to as the highest weight of the representation. Clearly, 
this single value characterizes a key property of the representation. 
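
The stepping-operator relations can be checked directly on the explicit 3 x 3 matrices. The sketch below (Python with NumPy, our choice) assumes the generators exactly as listed in eqns. (8.80); for those matrices $[T_2, T_3] = iT_1$, so the raising operator is taken to be $T_2 + iT_3$. The dictionary keys are illustrative labels only.

```python
import numpy as np

# Generators as listed in eqns. (8.80).
T1 = np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
T2 = np.array([[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]])
T3 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
I3 = np.eye(3)


def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


H = T1
E_plus = T2 + 1j * T3   # raising operator for these matrices
E_minus = T2 - 1j * T3  # lowering operator
T_sq = T1 @ T1 + T2 @ T2 + T3 @ T3

checks = {
    "[H, E+] = +E+": np.allclose(comm(H, E_plus), E_plus),
    "[H, E-] = -E-": np.allclose(comm(H, E_minus), -E_minus),
    "E+ dagger = E-": np.allclose(E_plus.conj().T, E_minus),
    "T^2 = E-E+ + H(H+1)": np.allclose(T_sq, E_minus @ E_plus + H @ (H + I3)),
    "T^2 = E+E- + H(H-1)": np.allclose(T_sq, E_plus @ E_minus + H @ (H - I3)),
}

weights = np.linalg.eigvalsh(H)        # eigenvalues of H, ascending
lam_max = weights.max()                # highest weight
casimir = lam_max * (lam_max + 1)      # expected value of T^2
for name, ok in checks.items():
    print(name, ok)
```

In this representation the weights are -1, 0 and 1, so the highest weight is 1, there are $2\Lambda_{\rm max} + 1 = 3$ states, and $T^2$ is twice the identity rather than $\Lambda_{\rm max}^2$ times the identity.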
What the above argument does not tell us is the value of $\Lambda_{\rm max}$. That is 
determined by the dimension of the irreducible representation which gives rise 
to rotations. In field theory the value of $\Lambda_{\rm max}$ depends, in practice, on the number 
of spacetime indices on field variables. Since the matrices for rotation in three 
spatial dimensions are fixed by the spacetime dimension itself, the only freedom 
left in transformation properties under rotations is the number of spacetime 
indices which can be operated on by a rotational transformation matrix. A 
scalar (no indices) requires no rotation matrix, a vector (one index) requires 
one, a rank 2-tensor requires two and so on. The number of independently 
transforming components in the field becomes essentially blocks of $2\Lambda_{\rm max} + 1$ 
and defines the spin of the fields. 


8.6 Examples of discrete and continuous groups 


Some groups are important because they arise in field theory with predictable 
regularity; others are important because they demonstrate key principles with a 
special clarity. 


8.6.1 GL(N, C): the general linear group 


The set of all complex N × N, non-singular matrices forms a group. This 
group has many sub-groups which are important in physics. Almost all physical 
models can be expressed in terms of variables which transform as sub-groups of 
this group. 


(1) Matrix multiplication combines non-singular matrices into new non- 
singular matrices. 


(2) Matrix multiplication is associative. 


(3) The identity is the unit matrix 


    $I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$.    (8.126) 


(4) Every non-singular matrix has an inverse, by definition. 


The representation space of a collection of matrices is the vector space on which 
the components of those matrices are defined. Since matrices normally multiply 
vectors, mapping one vector, $v^A$, onto another vector, $v'^A$, 


    $v^A \rightarrow v'^A = U^A{}_B\, v^B$,    (8.127) 
it is normal to think of these matrices as acting on group vectors. In field 
theory, these transformations are especially important since the group vectors 
are multiplets of fields, e.g. 


    $\phi(x) = \begin{pmatrix} \phi_1(x) \\ \phi_2(x) \\ \vdots \\ \phi_{d_R}(x) \end{pmatrix}$,    (8.128) 


where $d_R$ is the dimension of the representation, or the size of the $d_R \times d_R$ 
matrices. Note: the dimension of a representation (the number of components 
in a multiplet) is not necessarily the same as the dimension of the group itself. 
For example: a three-dimensional vector ($d_R = 3$) might be constrained, 
by some additional considerations, to have only an axial symmetry (group 
dimension $d_G = 1$, a single angle of rotation); in that case one requires a 3 × 3 
representation of a one-dimensional group, since vectors in three dimensions 
have three components. 


8.6.2 U(N): unitary matrices 


U(N) is the set of all unitary matrices of matrix dimension N. An N × N unitary 
matrix satisfies 


    $U^\dagger U = (U^T)^* U = I$,    (8.129) 


where $I$ is the N × N unit matrix, i.e. $U^\dagger = U^{-1}$. When N = 1, the matrices 
are single-component numbers. An N × N matrix contains $N^2$ components; 
however, since the transpose matrix is related to the untransposed matrix by 
eqn. (8.129), only half of the off-diagonal elements are independent of one 
another. Moreover, the diagonal elements must be real in order to satisfy the 
condition. This means that the number of independent real elements in a unitary 
matrix is $(N^2 - N)/2$ complex plus N real, i.e. $N^2$ real numbers. This is 
called the dimension of the group. U(N) is non-Abelian for N > 1. 


8.6.3 SU(N): the special unitary group 


The special unitary group is the sub-group of U(N) which consists of all unitary 
matrices with unit determinant. Since the requirement of unit determinant is an 
extra constraint on all of the independent elements of the group (i.e. the 
product of the eigenvalues), this reduces the number of independent elements 
by one compared with U(N). Thus the dimension of SU(N) is $N^2 - 1$ real 
components. SU(N) is non-Abelian for N > 1. SU(N) has several simple 
properties: 
    $C_2(G_{\rm adj}) = N$ 
    $d_G = N^2 - 1$ 
    $d_F = N$ 
    $C_2(G_F) = \frac{N^2 - 1}{2N}$,    (8.130) 


where $C_2(G)$ is the quadratic Casimir invariant in representation G, $d_G$ is 
the dimension of the group, and $d_F$ is the dimension of the fundamental 
representation $R \rightarrow F$. 
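
As an illustration for N = 2, the two Casimir values quoted above can be reproduced numerically. This sketch (Python with NumPy, our choice) assumes the common normalization $T^a = \sigma^a/2$ for the fundamental representation and $(T^a)_{bc} = -i\epsilon_{abc}$ for the adjoint; with that assumed normalization the quadratic sum of the generators is directly the Casimir value times the identity. The text's own normalization convention may differ from this by a constant factor.

```python
import numpy as np

N = 2

# Pauli matrices; fundamental generators assumed to be sigma/2.
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
T_fund = [s / 2 for s in sigma]

# Levi-Civita symbol; adjoint generators assumed to be -i * epsilon.
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c] = 1.0
    eps[a, c, b] = -1.0
T_adj = [-1j * eps[a] for a in range(3)]

# Quadratic sums of the generators in each representation.
casimir_fund = sum(T @ T for T in T_fund)
casimir_adj = sum(T @ T for T in T_adj)

print("fundamental:", casimir_fund[0, 0].real, "expected", (N**2 - 1) / (2 * N))
print("adjoint:    ", casimir_adj[0, 0].real, "expected", N)
```

Both sums come out proportional to the identity, as Schur's lemma requires, with coefficients 3/4 and 2 respectively.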


8.6.4 SU (2) 


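The set of 2 × 2 unitary matrices with unit determinant has $N^2 - 1 = 3$ independent 
elements for N = 2. Up to similarity transformations, these may be written in terms of 
three real parameters $(\theta_1, \theta_2, \theta_3)$: 


    $g_1 = \begin{pmatrix} \cos\left(\frac{1}{2}\theta_1\right) & i \sin\left(\frac{1}{2}\theta_1\right) \\ i \sin\left(\frac{1}{2}\theta_1\right) & \cos\left(\frac{1}{2}\theta_1\right) \end{pmatrix}$    (8.131a) 


    $g_2 = \begin{pmatrix} \cos\left(\frac{1}{2}\theta_2\right) & \sin\left(\frac{1}{2}\theta_2\right) \\ -\sin\left(\frac{1}{2}\theta_2\right) & \cos\left(\frac{1}{2}\theta_2\right) \end{pmatrix}$    (8.131b) 


    $g_3 = \begin{pmatrix} e^{\frac{i}{2}\theta_3} & 0 \\ 0 & e^{-\frac{i}{2}\theta_3} \end{pmatrix}$.    (8.131c) 


These matrices are the exponentiated Pauli matrices $e^{\frac{i}{2}\theta_i \sigma_i}$. Using this basis, 
any element of the group may be written as a product of one or more of these 
matrices with some $\theta_i$. 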
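
The identification with exponentiated Pauli matrices can be verified numerically. The sketch below (Python with NumPy, our choice) compares $\exp(\frac{i}{2}\theta\sigma_i)$, computed from the eigen-decomposition of each Pauli matrix, with the closed trigonometric forms of eqns. (8.131). The function name `exp_i_half` and the sample angle are illustrative assumptions.

```python
import numpy as np

# Pauli matrices.
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]]),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def exp_i_half(theta, s):
    """exp(i*theta*s/2) for a Hermitian matrix s, via its eigen-decomposition."""
    w, v = np.linalg.eigh(s)
    return v @ np.diag(np.exp(0.5j * theta * w)) @ v.conj().T


theta = 0.7  # arbitrary sample angle
c, sn = np.cos(theta / 2), np.sin(theta / 2)

# Closed forms corresponding to eqns. (8.131a)-(8.131c).
closed_forms = [
    np.array([[c, 1j * sn], [1j * sn, c]]),
    np.array([[c, sn], [-sn, c]], dtype=complex),
    np.diag([np.exp(0.5j * theta), np.exp(-0.5j * theta)]),
]
exponentials = [exp_i_half(theta, s) for s in sigma]

for g, e in zip(closed_forms, exponentials):
    print(np.allclose(g, e), np.isclose(np.linalg.det(g), 1.0))
```

Each closed form agrees with the corresponding exponential, and each has unit determinant and is unitary, as required for an element of SU(2).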


8.6.5 U(1): the set of numbers $z$ : $|z|^2 = 1$ 


The set of all complex numbers $U = e^{i\theta}$ with unit modulus forms an Abelian 
group under multiplication: 


(1) $e^{i\theta_1}\, e^{i\theta_2} = e^{i(\theta_1 + \theta_2)}$, 
(2) $(e^{i\theta_1}\, e^{i\theta_2})\, e^{i\theta_3} = e^{i\theta_1}\, (e^{i\theta_2}\, e^{i\theta_3})$, 
(3) $e^{i\theta}\, e^{i0} = e^{i\theta}$, 


(4) $U^{-1} = U^*$ since $e^{i\theta}\, e^{-i\theta} = e^{i0} = 1$. 
The representation space of this group is the space of complex scalars $\Phi$, with 
constant modulus: 


    $\Phi^* \Phi \rightarrow (U\Phi)^*\, U\Phi = \Phi^*\, U^* U\, \Phi = \Phi^* \Phi$.    (8.132) 


This group is important in electromagnetism; it is this symmetry group of 
complex phases which is connected to the existence of a conserved electrical 
charge. 


8.6.6 $Z_N$: the Nth roots of unity 


The Nth roots of unity form a sub-group of U(1). These complex numbers may 
be written in the form $\exp(2\pi i \frac{p}{N})$, for $p = 0, \ldots, N - 1$. The group $Z_N$ is 
special because it is not infinite. It has exactly N discrete elements. The group 
has the topology of a circle, and the elements may be drawn as equi-distant 
points on the circumference of the unit circle in the complex plane. $Z_N$ is a 
modulo group. Its elements satisfy modulo N arithmetic by virtue of the 
multi-valuedness of the complex exponential. The group axioms are thus satisfied as 
follows: 


(1) $\exp\left(2\pi i \frac{p}{N}\right) \exp\left(2\pi i \frac{p'}{N}\right) = \exp\left(2\pi i \frac{p + p'}{N}\right) = \exp\left(2\pi i \left[\frac{p + p'}{N} + m\right]\right)$, 
where N, m, p are integers; 


(2) follows trivially from U(1); 
(3) follows trivially from U(1); 


(4) the inverse exists because of the multi-valued property that 


    $\exp\left(-2\pi i \frac{p}{N}\right) = \exp\left(2\pi i \frac{N - p}{N}\right)$.    (8.133) 
Thus when p = N, one arrives back at the identity, equivalent to p = 0. 


The representation space of this group is undefined. It can represent translations 
or shifts along a circle for a complex scalar field. $Z_2$ is sometimes thought of 
as a reflection symmetry of a scalar field, i.e. $Z_2 = \{1, -1\}$ and $\phi \rightarrow -\phi$. An 
action which depends only on $\phi^2$ has this symmetry. 

Usually $Z_N$ is discussed as an important sub-group of very many continuous 
Lie groups. The presence of $Z_N$ as a sub-group of another group usually 
signifies some multi-valuedness or redundancy in that group. For example, 
the existence of a $Z_2$ sub-group in the Lie group SU(2) accounts for the 
double-valued nature of electron spin. 
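
The modulo-N arithmetic of the roots of unity is easy to confirm numerically. The following sketch uses only the Python standard library; the function names `z_n` and `index_of` and the choice N = 6 are illustrative assumptions, not taken from the text.

```python
import cmath


def z_n(N):
    """The N-th roots of unity exp(2*pi*i*p/N) for p = 0, ..., N-1."""
    return [cmath.exp(2j * cmath.pi * p / N) for p in range(N)]


def index_of(z, elements, tol=1e-9):
    """Return the label p of the group element numerically equal to z."""
    for p, e in enumerate(elements):
        if abs(z - e) < tol:
            return p
    raise ValueError("not a group element")


N = 6
elements = z_n(N)

# Closure: the product of elements p and q is element (p + q) mod N.
closure_ok = all(
    index_of(elements[p] * elements[q], elements) == (p + q) % N
    for p in range(N)
    for q in range(N)
)

# Inverse: the complex conjugate of element p is element (N - p) mod N.
inverse_ok = all(
    index_of(elements[p].conjugate(), elements) == (N - p) % N
    for p in range(N)
)
print(closure_ok, inverse_ok)
```

Multiplying group elements adds their labels modulo N, and the inverse of the element labelled p is the one labelled N - p, in line with eqn. (8.133).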
8.6.7 O(N): the orthogonal group 


The orthogonal group consists of all matrices which satisfy 
    $U^T U = I$    (8.134) 


under normal matrix multiplication. In other words, the transpose of each matrix 
is the inverse matrix. All such matrices are real. Since the condition (8.134) is 
symmetric, it imposes $(N^2 - N)/2 + N = N(N + 1)/2$ independent constraints on the 
$N^2$ real components, leaving $N(N - 1)/2$ independent components. This is the 
dimension of the group. The orthogonal group is non-Abelian for N > 2 and is 
trivial for N = 1. 

The special orthogonal group is the sub-group of O(N) which consists of 
matrices with unit determinant. Since an orthogonal matrix has determinant ±1, 
this condition only fixes a sign and does not remove a continuous parameter, so 
SO(N) also has $N(N - 1)/2$ independent components. 


8.6.8 SO(3): the three-dimensional rotation group 


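This non-Abelian group has three independent components corresponding to 
rotations about three independent axes in a three-dimensional space. The group 
elements may be parametrized by the rotation matrices $U_i$ about the given axis i: 


    $U_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_1 & \sin\theta_1 \\ 0 & -\sin\theta_1 & \cos\theta_1 \end{pmatrix}$    (8.135) 


    $U_2 = \begin{pmatrix} \cos\theta_2 & 0 & -\sin\theta_2 \\ 0 & 1 & 0 \\ \sin\theta_2 & 0 & \cos\theta_2 \end{pmatrix}$    (8.136) 


    $U_3 = \begin{pmatrix} \cos\theta_3 & \sin\theta_3 & 0 \\ -\sin\theta_3 & \cos\theta_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.    (8.137) 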


The representation space of this group is a three-dimensional Euclidean space 
and the transformations rotate three-dimensional vectors about the origin, pre- 
serving their lengths but not their directions. Notice that these matrices do not 
commute; i.e. a rotation about the x axis followed by a rotation about the y axis, 
is not the same as a rotation about the y axis followed by a rotation about the x 
axis. 
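
The non-commutativity is easily demonstrated numerically. The sketch below (Python with NumPy, our choice) uses the rotation matrices in the form of eqns. (8.135) and (8.136); the angles and the test vector are arbitrary illustrative values.

```python
import numpy as np


def U1(t):
    """Rotation about the x axis, in the form of eqn. (8.135)."""
    return np.array([[1, 0, 0],
                     [0, np.cos(t), np.sin(t)],
                     [0, -np.sin(t), np.cos(t)]])


def U2(t):
    """Rotation about the y axis, in the form of eqn. (8.136)."""
    return np.array([[np.cos(t), 0, -np.sin(t)],
                     [0, 1, 0],
                     [np.sin(t), 0, np.cos(t)]])


a, b = 0.4, 1.1                 # arbitrary rotation angles
x_then_y = U2(b) @ U1(a)        # rotate about x first, then about y
y_then_x = U1(a) @ U2(b)        # rotate about y first, then about x

commute = np.allclose(x_then_y, y_then_x)

v = np.array([1.0, 2.0, 3.0])   # arbitrary test vector
length_preserved = np.isclose(np.linalg.norm(x_then_y @ v), np.linalg.norm(v))
print("commute:", commute, "length preserved:", length_preserved)
```

The two orderings give different matrices, while either ordering preserves the length of the vector and has unit determinant.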


8.6.9 SO(2): the two-dimensional rotation group 


This group has only one independent component, corresponding to rotations about 
a point in a plane. Any element of SO(2) may be written in the form 


    $U = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$.    (8.138) 
The representation space of this group is a two-dimensional Euclidean space, 
and the transformation rotates two-component vectors about the origin. Notice 
how the matrices parametrizing SO(3) are simply rotations of SO(2) embedded 
in a three-dimensional framework. 


8.7 Universal cover groups and centres 


We know that groups can contain other groups, as sub-groups of the whole, 
and therefore that some are larger than others. The universal cover group is 
defined to be a simply connected group which contains an image of every point 
in a given Lie group. If we consider an arbitrary Lie group, in general it will 
have companion groups which are locally the same, but globally different. The 
best known example of this is the pair SU(2) and SO(3), which are locally 
isomorphic, but globally different. In fact SU (2) contains two images of SO (3) 
or covers it twice, or contains two equivalent copies of it. Taking this a step 
further, if three groups have the same local structure, then they will all be sub- 
groups of the universal cover groups. 

If we begin with a Lie algebra, it is possible to exponentiate the generators of 
the algebra to form group elements: 


    $\Theta = \theta_a T^a \rightarrow G = e^{i\Theta}$.    (8.139) 


The group formed by this exponentiation is not unique; it depends on the 
particular representation of the algebra being exponentiated. For instance, 
the 2 x 2 representation of SU (2) exponentiates to SU(2), while the 3 x 3 
representation of SU (2) exponentiates to SO(3). Both of these groups are 
locally isomorphic but differ in their centres. In the case of SU(2) and SO(3), 
we can relate them by factorizing out the centre of the universal cover group, 


    $SU(2)/Z_2 = SO(3)$.    (8.140) 
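
The difference in centres can be seen by exponentiating the same generator in the two representations. The sketch below (Python with NumPy, our choice) assumes the normalization $\sigma_3/2$ for the 2 x 2 generator and the matrix $T^3$ of eqn. (8.80) for the 3 x 3 generator, and exponentiates with an explicit factor of i; the function name `exp_i` is illustrative.

```python
import numpy as np


def exp_i(theta, gen):
    """exp(i*theta*gen) for a Hermitian generator, via eigen-decomposition."""
    w, v = np.linalg.eigh(gen)
    return v @ np.diag(np.exp(1j * theta * w)) @ v.conj().T


sigma3_half = np.diag([0.5, -0.5]).astype(complex)            # 2 x 2 generator
T3_adj = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])       # 3 x 3 generator

# A rotation through 2*pi in each representation.
g_fund = exp_i(2 * np.pi, sigma3_half)
g_adj = exp_i(2 * np.pi, T3_adj)

print(np.allclose(g_fund, -np.eye(2)), np.allclose(g_adj, np.eye(3)))
```

A rotation through $2\pi$ returns to the identity in the 3 x 3 representation but gives minus the identity in the 2 x 2 representation. The two elements $\{1, -1\}$ of the centre of SU(2) are thus mapped to a single element of SO(3).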


From Schur’s lemma, we know that the centre of a group is only composed 
of multiples of the identity matrix, and that, in order to satisfy the rules of group 
multiplication, they must also have modulus one. It follows from these two facts 
that any element of the centre of a group can be written 


    $g_c = \exp(\pm 2\pi i\, q/N)\, I, \quad q = 0, \ldots, N - 1$.    (8.141) 


These elements are the Nth roots of unity for some N (in principle infinite, but 
in practice usually finite). If we start off with some universal cover group, 
whose centre is $Z_N$, then there will be many locally isomorphic groups which can 
be found by factoring out sub-groups of the centre. The largest thing one can 
divide out is $Z_N$ itself, i.e. the whole centre. The group formed in this way is 
called the adjoint group, and it is generated by the adjoint representation: 


    $\frac{\rm group}{\rm centre\ of\ group} = {\rm adjoint\ group}$.    (8.142) 
Table 8.1. Some common Lie algebras and groups. 


Algebra            Centre                  Cover 
A_N                Z_{N+1}                 SU(N + 1) 
B_N                Z_2                     SO(2N + 1) 
C_N                Z_2                     Sp(2N) 
D_N                Z_4 (N odd)             SO(2N) 
                   Z_2 × Z_2 (N even) 
E_6                Z_3                     E_6 
G_2, F_4, E_8      trivial 


But it is not necessary to factor out the entire centre, one can also factor out a 
sub-group of the full centre; this will also generate a locally isomorphic group. 
For example, SU(8) has centre $Z_8$. We can construct any of the following 
locally isomorphic groups: 


    $SU(8) \qquad SU(8)/Z_8 \qquad SU(8)/Z_4 \qquad SU(8)/Z_2$.    (8.143) 


Some well known Lie groups are summarized in table 8.1. 


8.7.1 Centre of SU(N) is $Z_N$ 


SU(N) is a simply connected group and functions as its own universal cover 
group. As the set of N × N matrices is the fundamental, defining representation, 
it is easy to calculate the elements of the centre. From Schur’s lemma, we know 
that the centre must be a multiple of the identity: 


    $g_c = \alpha\, I_N$,    (8.144) 


where $I_N$ is the N × N identity matrix. Now, SU(N) matrices have unit 
determinant, so 


    $\det(\alpha\, I_N) = \alpha^N = 1$.    (8.145) 


Thus, the solutions for $\alpha$ are the Nth roots of unity, $Z_N$. 
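
A short numerical illustration for N = 3 follows. It is a sketch in Python with NumPy (our choice); the sample SU(N) element is generated from a random matrix by a QR factorization and a rescaling to unit determinant, which is merely one convenient way to obtain such a matrix and is not taken from the text.

```python
import numpy as np

N = 3

# Candidate centre elements: alpha * I with alpha an N-th root of unity.
roots = [np.exp(2j * np.pi * q / N) for q in range(N)]
centre = [r * np.eye(N) for r in roots]
dets = [np.linalg.det(g) for g in centre]

# A sample SU(N) element: unitary factor of a random complex matrix,
# rescaled by an N-th root of its determinant so that det = 1.
rng = np.random.default_rng(0)
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
Q, _ = np.linalg.qr(A)
Q = Q / np.linalg.det(Q) ** (1 / N)

commutes = all(np.allclose(g @ Q, Q @ g) for g in centre)

# A phase that is not an N-th root of unity violates the determinant condition.
non_member = np.exp(2j * np.pi * 0.1) * np.eye(N)
non_member_det = np.linalg.det(non_member)

print(np.allclose(dets, 1.0), commutes, np.isclose(non_member_det, 1.0))
```

The three multiples of the identity have unit determinant and commute with the sample element, whereas a generic phase times the identity fails the determinant condition and so does not belong to SU(3).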


8.7.2 Congruent algebras: N -ality 


Since roots and weights of representations can be drawn as vectors in the Cartan 
sub-space, different representations produce similar, but not identical, patterns. 
Elements $E_\alpha$ of the algebra step through chains of solutions, creating a laced 
lattice-work pattern. Representations which exponentiate to the same group 
have patterns which are congruent to one another [124]. 
Congruence is a property of discrete sets. The correct terminology is 
‘congruent to x modulo m’. The property is simplest to illustrate for integers. x 
is said to be congruent to y modulo m if y − x is an integer multiple of m: 


    $y = x + km$,    (8.146) 


for integer k, m. Congruence modulo m is an equivalence relation, and it sorts 
numbers into classes or congruent sets. The patterns made by congruent sets 
can be overlain consistently. The equivalence class, $E_x$, is the set of all integers 
which can be found from x by adding integer multiples of m to it: 
    $E_x = \{x + km \mid {\rm integer}\ k\}$ 
    $\quad = \{\ldots, -2m + x, -m + x, x, x + m, x + 2m, \ldots\}$.    (8.147) 

There are exactly m different congruence classes modulo m, and these partition 
the integers; e.g. for m = 4, we can construct four classes: 

    $E_0 = \{\ldots, -8, -4, 0, 4, 8, \ldots\}$ 

    $E_1 = \{\ldots, -7, -3, 1, 5, 9, \ldots\}$ 

    $E_2 = \{\ldots, -6, -2, 2, 6, 10, \ldots\}$ 

    $E_3 = \{\ldots, -5, -1, 3, 7, 11, \ldots\}$.    (8.148) 
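
The classes above can be generated mechanically. The following sketch is plain Python; the function names `congruence_class` and `congruent` and the `span` parameter (how many multiples of m to list on each side) are illustrative assumptions.

```python
def congruence_class(x, m, span=2):
    """Members x + k*m of the class of x modulo m, for k = -span, ..., span."""
    representative = x % m  # smallest non-negative member of the class
    return [representative + k * m for k in range(-span, span + 1)]


def congruent(x, y, m):
    """True if y - x is an integer multiple of m."""
    return (y - x) % m == 0


m = 4
classes = {r: congruence_class(r, m) for r in range(m)}
for r, members in classes.items():
    print(f"E_{r} =", members)
```

For m = 4 this lists the same finite windows of the four classes as eqn. (8.148), and every integer falls into exactly one of them.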
Lie algebra representations can also be classified into congruence classes. 
Historically, congruence classes of SU(N) modulo N are referred to as N-ality 
as a generalization of ‘triality’ for SU(3). Each congruence class has a label 


q; q = 0 corresponds to no centre, or the adjoint congruence class. The well 
known algebras contain the following values [56]: 


    q = \sum_{k=1}^{n} \alpha_k \pmod{n+1}                    for A_n    (8.149)

    q = \alpha_n \pmod{2}                                     for B_n    (8.150)

    q = \alpha_1 + \alpha_3 + \alpha_5 + \cdots \pmod{2}      for C_n    (8.151)

    q = \alpha_1 - \alpha_2 + \alpha_4 - \alpha_5 \pmod{3}    for E_6    (8.152)

    q = \alpha_4 + \alpha_6 + \alpha_7 \pmod{2}               for E_7    (8.153)

    q = 0    for all representations of E_8, F_4, G_2.                   (8.154)


In the special case of D_n, the congruence classes require classification by a
two-component vector:

    q_1 = (\alpha_{n-1} + \alpha_n,\; 2\alpha_1 + 2\alpha_3 + \cdots
           + 2\alpha_{n-2} + (n-2)\alpha_{n-1} + n\alpha_n) \pmod{2}    odd n
    q_2 = (\alpha_{n-1} + \alpha_n,\; 2\alpha_1 + 2\alpha_3 + \cdots
           + 2\alpha_{n-3} + (n-2)\alpha_{n-1} + n\alpha_n) \pmod{4}    even n.
    (8.155)

The congruence is modulo the order of the centre. The algebra D_n requires a
two-dimensional label, since its centre is two-dimensional. E_8, F_4 and G_2
all have trivial centres, thus they all lie in a single class congruent to the adjoint.


8.7.3 Simple and semi-simple Lie algebras 


A Lie algebra is simple if it has no proper invariant sub-algebras; i.e. if only
one element (the identity) commutes with every other in the group. A simple
algebra is necessarily semi-simple. A semi-simple Lie algebra can be written in
block-diagonal form, as a direct sum of invariant sub-algebras, each of which is
a simple Lie algebra:


    A = A_1 \oplus A_2 \oplus A_3 \oplus \cdots \oplus A_N, (8.156)


i.e. it factorizes into block-diagonal form with simple blocks. A semi-simple
algebra has no Abelian invariant sub-algebras. 


8.8 Summary 


The existence of a symmetry in a physical system means that it is possible to
re-label parameters of a model without changing its form or substance. Identify the
symmetries of a physical system and one can distinguish between the freedom 
a system has to change and the constraints which hold it invariant: symmetries 
are thus at the heart of dynamics and of perspective. 

Symmetries form groups, and can therefore be studied with group theory.
Since a symmetry means that some quantity R_ξ does not change, when we vary
the action with respect to a parameter ξ, conservation of R_ξ is also linked to
the existence of the symmetry. All of the familiar conservation laws can be 
connected to fundamental symmetries. 

In the case of electromagnetism, Lorentz covariance was exposed just by 
looking at the field equations and writing them in terms of (3 + 1) dimensional 
vectors. The chapters which follow examine the transformations which change 
the basic variables parametrizing the equations of motion, and the repercussions 
such transformations have for covariance.


Spacetime transformations


An important class of symmetries is that which refers to the geometrical
disposition of a system. This includes translational invariance, rotational invariance
and boosts. Historically, covariant methods were inspired by the fact that the 
speed of light in a vacuum is constant for all inertial observers. This follows 
from Maxwell’s equations, and it led Einstein to the special theory of relativity 
and covariance. The importance of covariance has since been applied to many 
different areas in theoretical physics. 

To discuss coordinate transformations we shall refer to figure 9.1, which
shows two coordinate systems moving with a relative velocity v = βc. The
constancy of the speed of light in any inertial frame tells us that the line element
(and the corresponding proper time) must be invariant for all inertial observers.
For a real constant Ω, this implies that


    ds^2 = \Omega^2\, ds'^2 = \Omega^2 \left(-c^2 dt'^2 + d\mathbf{x}' \cdot d\mathbf{x}'\right). (9.1)


This should not be confused with the non-constancy of the effective speed of
light in a material medium; our argument here concerns the vacuum only. This
property expresses the constancy, or x-independence, of c. The factor Ω² is
of little interest here as long as it is constant: one may always re-scale the
coordinates to absorb it. Normally one is not interested in re-scaling measuring
rods when comparing coordinate systems, since it only makes systems harder to
compare. However, we shall return to this point in section 9.7.

For particles which travel at the speed of light (massless particles), one has
ds² = 0 always, or


    \left| \frac{d\mathbf{x}}{dt} \right| = c. (9.2)


Fig. 9.1. The schematic arrangement for discussing coordinate transformations.
Coordinate systems S(x) and S'(x') are in relative motion, with speed v = βc.


Now, since ds² = 0, it is clearly true that Ω²(x) ds² = 0, for any non-singular,
non-zero function Ω(x). Thus the value of c is preserved by a group of
transformations which obey


    ds^2 = \Omega^2(x)\, ds'^2. (9.3)


This set of transformations forms a group called the conformal group. 

If all particles moved at the speed of light, we would identify this group as 
being the fundamental symmetry group for spacetime. However, for particles 
not moving at c, the line element is non-zero and may be characterized by 


    \left| \frac{d\mathbf{x}}{dt} \right| = \beta c, (9.4)


for some constant β = v/c. Since we know that, in any frame, a free particle
moves in a straight line at constant velocity, we know that β must be a constant
and thus


    ds^2 = ds'^2 \neq 0. (9.5)


If it were possible for an x-dependence to creep in, then one could transform 
an inertial frame into a non-inertial frame. The group of transformations which 
preserve the line element in this way is called the inhomogeneous Lorentz group, 
or Poincaré group.

In the non-relativistic limit, coordinate invariances are described by the
so-called Galilean group. This group is no smaller than the Lorentz group, but
space and time are decoupled, and the speed of light does not play a role at
all. The non-relativistic limit assumes that c → ∞. Galilean transformations
lie closer to our intuition, but they are often more cumbersome since space and 
time must often be handled separately. 


9.1 Parity and time reversal 


In an odd number of spatial dimensions (n = 2l + 1), a parity, or space-reflection
transformation P has the following non-zero tensor components:


    P^0{}_0 = 1
    P^i{}_i = -1, (9.6)


where i is not summed in the last line. When this transformation acts on another
tensor object, it effects a change of sign on all space components. In other words,
each spatial coordinate undergoes x^i → −x^i. The transformation A → −A is
the discrete group Z_2 = {1, −1}.

In an even number of spatial dimensions (n = 2l), this construction does not
act as a reflection, since the combination of an even number of reflections is not
a reflection at all. In group language, (Z_2)^{2l} = {1}. It is easy to check that, in
two spatial dimensions, reflection in the x_1 axis followed by reflection in the x_2
axis is equivalent to a continuous rotation. To make a true reflection operator in
an even number of space dimensions, one of the spatial indices must be left out.
For example,


    P^0{}_0 = 1
    P^i{}_i = -1    (i = 1, \ldots, n-1)
    P^i{}_i = +1    (i = n). (9.7)
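
The statement about even and odd numbers of reflections can be verified numerically through the determinant of the spatial block. The sketch below is illustrative only; the helper names (`reflect_all`, `reflect_all_but_last`, `det_diagonal`, `rotation_2d`, `close`) are assumptions made for this example.

```python
import math


def reflect_all(n):
    """diag(-1, ..., -1): flips every one of the n spatial coordinates."""
    return [[-1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def reflect_all_but_last(n):
    """diag(-1, ..., -1, +1): the last spatial index is left out."""
    P = reflect_all(n)
    P[n - 1][n - 1] = 1.0
    return P


def det_diagonal(D):
    """Determinant of a diagonal matrix: the product of its diagonal entries."""
    d = 1.0
    for i in range(len(D)):
        d *= D[i][i]
    return d


def rotation_2d(theta):
    """Two-dimensional rotation matrix with rows (cos, sin) and (-sin, cos)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def close(A, B, tol=1e-12):
    """Element-wise comparison of two matrices within a tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


if __name__ == "__main__":
    print(det_diagonal(reflect_all(3)))                  # odd n: a true reflection
    print(det_diagonal(reflect_all(2)))                  # even n: not a reflection
    print(close(reflect_all(2), rotation_2d(math.pi)))   # it is a rotation by pi
    print(det_diagonal(reflect_all_but_last(2)))         # one index left out
```

A determinant of −1 signals a genuine reflection, while +1 signals an element that can be reached by a continuous rotation.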


The time reversal transformation in any number of dimensions performs the 
analogous function for time coordinates: 


    T^0{}_0 = -1
    T^i{}_i = 1. (9.8)


These transformations belong to the Lorentz group (and others), and are 
sometimes referred to as large Lorentz transformations since they cannot be 
formed by integration or repeated combination of infinitesimal transformations.


9.2 Translational invariance


A general translation in space, or in time, is a coordinate shift. A scalar field
transforms simply:

    \phi(x) \rightarrow \phi(x + \Delta x). (9.9)

The direction of the shift may be specified explicitly, by

    \phi(t, x^i) \rightarrow \phi(t, x^i + \Delta x^i)
    \phi(t, x^i) \rightarrow \phi(t + \Delta t, x^i). (9.10)

Invariance under such a constant shift of a coordinate is almost always a 
prerequisite in physical problems found in textbooks. Translational invariance 
is easily characterized by the coordinate dependence of Green functions. Since 
the Green function is a two-point function, one can write it as a function of x
and x' or in terms of variables rotated by 45 degrees, (x − x')/√2 and (x + x')/√2.
These are more conveniently defined in terms of a difference and an average
(mid-point) position:


    \tilde{x} = (x - x')
    \bar{x} = \frac{1}{2}(x + x'). (9.11)

The first of these is invariant under coordinate translations, since

    x - x' = (x + a) - (x' + a). (9.12)


The second is not, however. Thus, in a theory exhibiting translational
invariance, the two-point function must depend only on x̃ = x − x'.


9.2.1 Group representations on coordinate space 


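Translations are usually written in an additive way,

    x^\mu \rightarrow x^\mu + a^\mu, (9.13)


but, by embedding spacetime in one extra dimension, d_R = (n + 1) + 1, one can
produce a group vector formulation of the translation group:


    \begin{pmatrix} x^{\mu\prime} \\ 1 \end{pmatrix}
      = \begin{pmatrix} 1 & a^\mu \\ 0 & 1 \end{pmatrix}
        \begin{pmatrix} x^\mu \\ 1 \end{pmatrix}. (9.14)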


This has the form of a group vector multiplication. The final 1 in the column 
vector is conserved and plays only a formal role. This form is common in 
computer representations of translation, such as in computer graphics. 
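
As a concrete illustration of the embedding in eqn. (9.14), the sketch below builds the augmented translation matrix and applies it to a coordinate vector padded with a final 1. The function names `translation_matrix`, `matvec` and `matmul` are assumptions for this example, written in plain Python rather than with any particular graphics library.

```python
def translation_matrix(a):
    """(n+1) x (n+1) matrix: identity block plus a final column holding a."""
    n = len(a)
    T = [[1.0 if i == j else 0.0 for j in range(n + 1)] for i in range(n + 1)]
    for i in range(n):
        T[i][n] = float(a[i])
    return T


def matvec(M, v):
    """Matrix times column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def matmul(A, B):
    """Matrix product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    x = [10.0, 20.0, 30.0, 40.0, 1.0]   # (x^0, x^1, x^2, x^3, 1)
    a = [1.0, 2.0, 3.0, 4.0]
    print(matvec(translation_matrix(a), x))  # shifted coordinates, final 1 kept
```

Multiplying two such matrices adds their shifts, which is the group combination rule written multiplicatively.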

A representation of translations which is particularly important in quantum
mechanics is the differential coordinate representation. Consider an infinitesimal
translation a_μ = ε_μ. This transformation can be obtained from an
exponentiated group element of the form


    U(\epsilon) = \exp\left(i\,\theta^A T_A\right) (9.15)

by writing


    U(\epsilon) = \exp\left(i\,\epsilon_\mu k^\mu\right)
                = \exp\left(i\,\epsilon_\rho\, p^\rho/\chi_h\right)
                = \left(1 + i\,\epsilon_\rho\, p^\rho/\chi_h\right), (9.16)


where

    p_\mu = \chi_h k_\mu = -i\,\chi_h\,\partial_\mu. (9.17)

The action of the infinitesimal group element is thus

    x^\mu \rightarrow U(\epsilon)\,x^\mu = (1 + \epsilon^\rho \partial_\rho)\,x^\mu
          = x^\mu + \epsilon^\rho\,\eta_\rho{}^\mu = x^\mu + \epsilon^\mu. (9.18)

The reason for writing the generator,

    T_A \rightarrow p_\mu/\chi_h, (9.19)


in this form, is that p_μ is clearly identifiable as a momentum operator which
satisfies


    [x, p] = i\,\chi_h. (9.20)


Thus, it is the momentum divided by a dimensionful scale (i.e. the wavenumber
k_μ) which is the generator of translations. In fact, we already know this from
Fourier analysis.

The momentum operator closely resembles that from quantum mechanics.
The only difference is that the scale χ_h (with dimensions of action), which is
required to give p_μ the dimensions of momentum, is not necessarily ħ. It is
arbitrary. The fact that ħ is small is the physical content of quantum mechanics;
the remainder is group theory. What makes quantum mechanics special and
noticeable is the non-single-valued nature of the exponentiated group element.
The physical consequence of a small χ_h is that even a small translation will
cause the argument of the exponential to go through many revolutions of 2π. If
χ_h is large, then this will not happen. Physically this means that the oscillatory
nature of the group elements will be very visible in quantum mechanics, but 
essentially invisible in classical mechanics. This is why a wavelike nature is 
important in quantum mechanics.


9.2.2 Bloch’s theorem: group representations on field space


Bloch’s theorem, well known in solid state physics, is used to make predictions 
about the form of wavefunctions in systems which have periodic potentials. 
In metals, for instance, crystal lattices look like periodic arrays of potential 
wells, in which electrons move. The presence of potentials means that the 
eigenfunctions are not plane waves of the form 


    e^{i k (x - x')}, (9.21)


for any x, x'. Nevertheless, translational invariance by discrete vector jumps a_i
is a property which must be satisfied by the eigenfunctions


    \phi(t, \mathbf{x} + \mathbf{a}) = U(\mathbf{a})\,\phi(t, \mathbf{x}) = e^{i \mathbf{k}\cdot\mathbf{a}}\,\phi(t, \mathbf{x}). (9.22)


9.2.3 Spatial topology and boundary conditions 


Fields which live on spacetimes with non-trivial topologies require boundary 
conditions which reflect the spacetime topology. The simplest example of this 
is the case of periodic boundary conditions: 


    \phi(x) = \alpha\,\phi(x + L), (9.23)


for some number α. Periodic boundary conditions are used as a model for
homogeneous crystal lattices, where the periodicity is interpreted as translation
by a lattice cell; they are also used to simulate infinite systems with finite
ones, allowing the limit L → ∞ to be taken in a controlled manner. Periodic
boundary conditions are often the simplest to deal with. 

The value of the constant α can be specified in a number of ways. Setting it
to unity implies a strict periodicity, which is usually over-restrictive. Although
it is pragmatic to specify a boundary condition on the field, it should be noted
that the field itself is not an observable. Only the probability P = (φ, φ) and
its associated operator P̂ are observables. In Schrödinger theory, for example,
P̂ = ψ*(x)ψ(x), and one may have ψ(x + L) = e^{iθ} ψ(x) and still preserve
the periodicity of the probability.
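
A short numerical check makes the point that a phase in the boundary condition leaves the probability strictly periodic. This is a minimal sketch under stated assumptions: the period `L`, the phase `theta` and the particular periodic envelope `u` are arbitrary illustrative choices, not values taken from the text.

```python
import cmath
import math

L = 2.0       # period (illustrative value)
theta = 0.7   # constant boundary phase (illustrative value)


def u(x):
    """A strictly periodic envelope, u(x + L) = u(x)."""
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * x / L)


def psi(x):
    """A field obeying psi(x + L) = exp(i*theta) * psi(x)."""
    return cmath.exp(1j * theta * x / L) * u(x)


def probability(x):
    """psi^*(x) psi(x), which should be strictly periodic."""
    return abs(psi(x)) ** 2


if __name__ == "__main__":
    x = 0.3
    print(abs(psi(x + L) - cmath.exp(1j * theta) * psi(x)))  # boundary condition
    print(abs(probability(x + L) - probability(x)))          # periodic probability
```

Both printed differences vanish to rounding accuracy: the field picks up the phase, but the probability does not see it.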

In general, if the field φ(x) is a complex field or has some multiplet symmetry,
then it need only return to its original value up to a gauge transformation; thus
α = U(x). For a multiplet, one may write


    \Phi_A(x + L) = U_A{}^B(x)\,\Phi_B(x). (9.24)


The transformation U is the exponentiated phase factor belonging to the 
group of symmetry transformations which leaves the action invariant. This is 
sometimes referred to as a non-integrable phase. Note that, for a local gauge 
transformation, one also has a change in the vector field: 


    A_\mu(x + L) = \beta\,A_\mu(x). (9.25)

This kind of transformation is required in order to obtain a consistent
energy-momentum tensor for gauge symmetric theories (see section 11.5). The value of
β depends now on the type of couplings present. From the spacetime symmetry,
a real field, A_μ, has only a Z_2 reflection symmetry, i.e. β = ±1, which
corresponds heuristically to ferromagnetic and anti-ferromagnetic boundary
conditions. Usually β = 1 to avoid multiple-valuedness.

In condensed matter physics, conduction electrons move in a periodic
potential of crystallized valence ions. The potential they experience is thus periodic:


    V(x) = V(x + L), (9.26)

and it follows that, for plane wave eigenfunctions,

    \phi(t, \mathbf{x} + \mathbf{L}) = U(\mathbf{L})\,\phi(t, \mathbf{x}) = e^{i \mathbf{k}\cdot\mathbf{L}}\,\phi(t, \mathbf{x}). (9.27)


This is a straightforward application of the scalar translation operator; the result 
is known as Bloch’s theorem. 

On toroidal spacetimes, i.e. those which have periodicities in several
directions, the symmetries of the boundary conditions are linked in several directions.
This leads to boundary conditions called co-cycle conditions [126]. Such 
conditions are responsible for flux quantization of magnetic fields in the Hall 
effect [65, 85]. 

In order to define a self-consistent set of boundary conditions, it is convenient 
to look at the so-called Wilson loops in the two directions of the torus, since they 
may be constructed independently of the eigenfunctions of the Hamiltonian. 
Normally this is presented in such a way that any constant part of the vector 
potential would cancel out, giving no information about it. This is the co-cycle 
condition, mentioned below. The Wilson line is defined by 


    W_j(x) = P \exp\left( i \int_{x_0}^{x} A_j\, dx^j \right), (9.28)


j not summed, for some fixed point x_0. It has an associated Wilson loop W_j(L'_j)
around a cycle of length L'_j in the x_j direction, defined by


    W_j(x_j + L'_j) = W_j(L'_j, x)\, W_j(x). (9.29)


The notation here means that the path-dependent Wilson line W_j(x) returns to
the same value multiplied by a phase W_j(L'_j, x) on translation around a closed
curve from x_j to x_j + L'_j. The coordinate dependence of the phase usually
arises in the context of a uniform magnetic field passing through the torus. In 
the presence of a constant magnetic field strength, the two directions of the torus 
are closely linked, and thus one has 


W (u: + Lı, u2) = exp [iziu + ie |Win, u2) (9.30) 


    W_2(u_1, u_2 + L_2) = \exp\left[ i c_2 L_2 \right] W_2(u_1, u_2). (9.31)

At this stage, it is normal to demonstrate the quantization of flux by opening out
the torus into a rectangle and integrating around its edges:


    W_1(u_2 + L_2)\, W_2(u_1)\, W_1^{-1}(u_2)\, W_2^{-1}(u_1 + L_1) = 1. (9.32)


This is known as the co-cycle condition, and has the effect of cancelling the
contributions to the c's and thus flux quantization is found independently of
the values of c_i due to the nature of the path. The most general consistency
requirement for the gauge field (Abelian or non-Abelian), which takes into 
account the phases c;, has been constructed in ref. [18]. 

The results above imply that one is not free to choose, say, periodic boundary 
conditions for bosons and anti-periodic boundary conditions for fermions in the 
presence of a uniform field strength. All fields must satisfy the same consistency 
requirements. Moreover, the spectrum may not depend on the constants, c_i,
which have no invariant values. One may understand this physically by noting 
that a magnetic field causes particle excitations to move in circular Landau 
orbits, around which the line integral of the constant vector potential is null. The 
constant part of the vector potential has no invariant meaning in the presence of 
a magnetic field. 

In more complex spacetimes, such as spheres and other curved surfaces, 
boundary conditions are often more restricted. The study of eigenfunctions 
(spherical harmonics) on spheres shows that general phases are not possible 
at identified points. Only the eigenvalues +1 are consistent with a spherical 
topology [17]. 


9.3 Rotational invariance: SO(n) 


Rotations are clearly of special importance in physics. In n spatial dimensions, 
the group of rotations is the group which preserves the Riemannian, positive 
definite, inner product between vectors. In Cartesian coordinates this has the 
well known form 


    \mathbf{x} \cdot \mathbf{y} = x^i y_i. (9.33)

The rotation group is the group of orthogonal matrices with unit determinant,
SO(n). Rotational invariance implies that the Green function only depends on
squared combinations of this type:


    G(x, x') = G\left( (x_1 - x'_1)^2 + (x_2 - x'_2)^2 + \cdots + (x_n - x'_n)^2 \right). (9.34)


The exception here is the Dirac Green function.


9.3.1 Group representations on coordinate space


Three-dimensional rotations are generated by infinitesimal matrices:


    T^1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}

    T^2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}

    T^3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} (9.35)

which satisfy a Lie algebra

    [T_i, T_j] = i\,\epsilon_{ijk} T_k. (9.36)
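
The algebra can be confirmed by direct matrix multiplication. The sketch below assumes the compact form (T_i)_{jk} = −i ε_{ijk} for the three generators, which reproduces the matrices listed in eqn. (9.35); the function names are illustrative and indices run from 0 to 2 in the code.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def generator(i):
    """(T_i)_{jk} = -i * epsilon_{ijk}; i = 0, 1, 2 stands for T^1, T^2, T^3."""
    return [[-1j * levi_civita(i, j, k) for k in range(3)] for j in range(3)]


def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[a][b] - BA[a][b] for b in range(3)] for a in range(3)]


def algebra_closes(tol=1e-12):
    """Check [T_i, T_j] = i * epsilon_{ijk} T_k for every pair (i, j)."""
    T = [generator(i) for i in range(3)]
    for i in range(3):
        for j in range(3):
            lhs = commutator(T[i], T[j])
            for a in range(3):
                for b in range(3):
                    rhs = sum(1j * levi_civita(i, j, k) * T[k][a][b]
                              for k in range(3))
                    if abs(lhs[a][b] - rhs) > tol:
                        return False
    return True


if __name__ == "__main__":
    print(algebra_closes())
```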


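These exponentiate into the matrices for a three-dimensional rotation,
parametrized by three Euler angles,


    R_x \equiv U_x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_1 & \sin\theta_1 \\ 0 & -\sin\theta_1 & \cos\theta_1 \end{pmatrix} (9.37)

    R_y \equiv U_y = \begin{pmatrix} \cos\theta_2 & 0 & -\sin\theta_2 \\ 0 & 1 & 0 \\ \sin\theta_2 & 0 & \cos\theta_2 \end{pmatrix} (9.38)

    R_z \equiv U_z = \begin{pmatrix} \cos\theta_3 & \sin\theta_3 & 0 \\ -\sin\theta_3 & \cos\theta_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}. (9.39)

The rotation group is most often studied in n = 3 dimensions, for obvious
reasons, though it is worth bearing in mind that its properties differ quite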
markedly with n. For instance, in two dimensions it is only possible to have 
rotation about a point. With only one angle of rotation, the resulting rotation 
group, SO(2), is Abelian and is generated by the matrix 


    T = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. (9.40)


This exponentiates into the group element


    U = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. (9.41)

A two-dimensional world can also be represented conveniently by adopting
complex coordinates on the Argand plane. In this representation, a vector is
simply a complex number z, and a rotation about the origin by an angle θ is
accomplished by multiplying:


    z \rightarrow e^{i\theta} z. (9.42)
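
The equivalence of the complex and matrix pictures is easy to check numerically. In the sketch below (function names are illustrative assumptions), multiplication by e^{iθ} is compared with the matrix of eqn. (9.41); with the row convention (cos θ, sin θ), (−sin θ, cos θ), the two agree when the matrix is evaluated at −θ, which is the usual difference between rotating the vector and rotating the axes.

```python
import cmath
import math


def rotate_complex(x, y, theta):
    """Rotate the point (x, y) by multiplying z = x + iy by exp(i*theta)."""
    z = cmath.exp(1j * theta) * complex(x, y)
    return z.real, z.imag


def rotate_matrix_941(x, y, theta):
    """Apply the matrix with rows (cos, sin) and (-sin, cos) to (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x + s * y, -s * x + c * y


if __name__ == "__main__":
    x, y, theta = 1.0, 2.0, 0.3
    print(rotate_complex(x, y, theta))
    print(rotate_matrix_941(x, y, -theta))  # same point, opposite sign of angle
```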


9.3.2 Eigenfunctions: circular and spherical harmonics 


The eigenfunctions of the rotation operators form a set of basis functions which 
span representation space. The rotational degrees of freedom in quantum fields 
can be expanded in terms of these eigenfunctions. 


Eigenfunctions in n = 2 In two dimensions, there is only a single axis of
rotation to consider. Then the action of the rotation operator T has the form


    -i\,\partial_\phi\, |\phi\rangle = \Lambda\, |\phi\rangle. (9.43)

This equation is trivially solved to give

    |\phi\rangle = e^{i\Lambda\phi}. (9.44)


In two spatial dimensions, there are no special restrictions on the value of Λ.
Notice that this means that the eigenfunctions are not necessarily single-valued
functions: under a complete rotation, they do not have to return to their original
value. They may differ by a phase:


    |\phi + 2\pi\rangle = e^{i\Lambda(\phi + 2\pi)} = e^{i\Lambda\phi}\, e^{i\delta}, (9.45)


where δ = 2πΛ. In higher dimensions e^{iδ} must be ±1 because of extra
topological restrictions (see below).


Eigenfunctions in n = 3 The theory of matrix representations finds all of 
the irreducible representations of the rotation algebra in n = 3 dimensions. 
These are characterized by their highest weight, or spin, with integral and 
half-integral values. There is another approach, however, which is to use a 
differential representation of the operators. The advantage of this is that it is then 
straightforward to find orthonormal basis functions which span the rotational 
space. 

A set of differential operators which satisfies the Lie algebra is easily 
constructed, and has the form 


    \mathbf{T} = \mathbf{r} \times i\nabla, (9.46)

or


    T_i = i\,\epsilon_{ijk}\, x_j\, \partial_k. (9.47)

This has the form of an orbital angular momentum operator L = r × p, and
it is no coincidence that it re-surfaces also in chapter 11 in that context with
only a factor of ħ to make the dimensions right. It is conventional to look for
the simultaneous eigenfunctions of the operators L_3 and L² by writing these
operators in spherical polar coordinates (with constant radius):


    L_1 = i\,(\sin\phi\,\partial_\theta + \cot\theta\cos\phi\,\partial_\phi)
    L_2 = i\,(-\cos\phi\,\partial_\theta + \cot\theta\sin\phi\,\partial_\phi)
    L_3 = -i\,\partial_\phi, (9.48)


and


    L^2 = -\left[ \frac{1}{\sin\theta}\,\partial_\theta\left(\sin\theta\,\partial_\theta\right)
          + \frac{1}{\sin^2\theta}\,\partial_\phi^2 \right]. (9.49)

The eigenvectors and eigenvalues involve two angles, and may be defined by


    L^2\, |\phi, \theta\rangle = T^2\, |\phi, \theta\rangle
    L_3\, |\phi, \theta\rangle = \Lambda_c\, |\phi, \theta\rangle. (9.50)


The solution to the second equation proceeds as in the two-dimensional case,
with only minor modifications due to the presence of the other coordinates. The
eigenfunctions are written as a direct product,


    |\phi, \theta\rangle = \Theta(\theta)\,\Phi(\phi), (9.51)


so that one may identify Φ(φ) with the solution to the two-dimensional problem,
giving


    |\phi, \theta\rangle = \Theta(\theta)\, e^{i\Lambda_c \phi}. (9.52)


The values of Λ_c are not arbitrary in this case: the solution of the constraints
for the θ coordinate imposes extra restrictions, because of the topology of a
three-dimensional space. Suppose we consider a rotation through an angle of
2π in the φ direction in the positive and negative directions:


    |\phi + 2\pi\rangle = e^{i\Lambda_c(\phi + 2\pi)} = e^{i\Lambda_c\phi}\, e^{i\delta}
    |\phi - 2\pi\rangle = e^{i\Lambda_c(\phi - 2\pi)} = e^{i\Lambda_c\phi}\, e^{-i\delta}. (9.53)


In two spatial dimensions, these two rotations are distinct, but in higher
dimensions they are not. This is easily seen by drawing the rotation as a circle
with an arrow on it (see figure 9.2). By flipping the circle about an axis in its
plane we can continuously deform the positive rotation into the negative one,
and vice versa. This is not possible in n = 2 dimensions. This means that they
are, in fact, different expressions of the same rotation. Thus,


    e^{i\delta} = e^{-i\delta} = \pm 1. (9.54)

Fig. 9.2. Exchange of particles in two and three spatial dimensions. In the plane,
there is only one rotation about the centre of mass which exchanges identical particles.
Clockwise and anti-clockwise are inequivalent. In three dimensions or greater, one may
rotate this plane around another axis and deform clockwise into anti-clockwise.


These two values are connected with the existence of two types of particle:
bosons and fermions, or


    \Lambda_c = 0, \pm\frac{1}{2}, \pm 1, \ldots, \pm\frac{m}{2}, (9.55)


for integer m. Note that, in older texts, it was normal to demand the single-
valuedness of the wavefunction, rather than using the topological argument
leading to eqn. (9.54). If one does this, then only integer values of Λ_c are
found, and there is an inconsistency with the solution of the group algebra.
This illustrates a danger in interpreting results based on coordinate systems
indiscriminately. The result here tells us that the eigenfunctions may be either
single-valued for integer Λ_c, or double-valued for half-integral Λ_c. In quantum
mechanics, it is normal to use the notation


    T^2 = l(l + 1) (9.56)
    \Lambda_c = m. (9.57)


If we now use this result in the eigenvalue equation for L², we obtain


    \left[ \frac{1}{\sin\theta}\frac{d}{d\theta}\left( \sin\theta\,\frac{d}{d\theta} \right)
      - \frac{m^2}{\sin^2\theta} + l(l+1) \right] \Theta(\theta) = 0. (9.58)


Putting z = cos θ in this equation turns it into the associated Legendre equation,


    \frac{d}{dz}\left[ (1 - z^2)\frac{dP}{dz} \right]
      + \left[ l(l+1) - \frac{m^2}{1 - z^2} \right] P = 0, (9.59)

where P = Θ(cos θ). The solutions of the associated Legendre equation may
be found for integral and half-integral values of Λ_c, though most books ignore
the half-integral solutions. They are rather complicated, and their form is not
specifically of interest here. They are detailed, for instance, in Gradshteyn and
Ryzhik [63]. Since the magnitude of L_3 cannot exceed that of L², by virtue of
the triangle (Schwarz) inequality,


    m^2 \leq l(l + 1), (9.60)

or

    -l \leq m \leq l. (9.61)

The rotational eigenfunctions are

    |l, m\rangle = N_{lm}\, P_l^m(\cos\theta)\, e^{im\phi}, (9.62)
with normalization factor

    N_{lm} = (-1)^m \sqrt{ \frac{2l+1}{4\pi}\, \frac{(l-m)!}{(l+m)!} }. (9.63)


These harmonic eigenfunctions reflect the allowed boundary conditions for 
systems on spherical spacetimes. They also reflect particle statistics under the 
interchange of identical particles. The eigenvalues of the spherical harmonics
are ±1 in 3 + 1 dimensions, corresponding to (symmetrical) bosons and
(anti-symmetrical) fermions; in 2 + 1 dimensions, the Abelian rotation group 
has arbitrary boundary conditions corresponding to the possibility of anyons, or 
particles with ‘any’ statistics [83, 89]. 


9.4 Lorentz invariance 
9.4.1 Physical basis 


The Lorentz group is a non-compact Lie group which lies at the heart of 
Einsteinian relativistic invariance. Lorentz transformations are coordinate 
transformations which preserve the relativistic scalar product 


    x^\mu y_\mu = x^0 y_0 + x^i y_i, (9.64)

and therefore also the line element

    ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu. (9.65)


Lorentz transformations include, like the Galilean group, translations, rotations
and boosts, or changes of relative speed. Under a linear transformation of x^μ,
we may write generally


    x^\mu \rightarrow x'^\mu = U^\mu{}_\nu\, x^\nu + a^\mu, (9.66)

where a^μ is a constant translation and


    U^\mu{}_\nu = \frac{\partial x'^\mu}{\partial x^\nu} (9.67)


is constant.


9.4.2 Lorentz boosts and rotations 


A boost is a change of perspective from one observer to another in relative 
motion to the first. The finite speed of light makes boosts special in Einsteinian 
relativity. If we refer to figure 9.1 and consider the case of relative motion along
the x^1 axis, such that the two frames S and S' coincide at x^0 = 0, the Lorentz
transformation relating the primed and unprimed coordinates may be written


    x'^0 = \gamma\,(x^0 - \beta x^1) = x^0 \cosh\alpha - x^1 \sinh\alpha
    x'^1 = \gamma\,(x^1 - \beta x^0) = x^1 \cosh\alpha - x^0 \sinh\alpha
    x'^2 = x^2
    x'^3 = x^3, (9.68)


where


    \gamma = 1/\sqrt{1 - \beta^2}
    \beta^i = v^i/c
    \beta = \sqrt{\beta^i \beta_i}
    \alpha = \tanh^{-1}\beta. (9.69)


The appearance of hyperbolic functions here, rather than, say, sines and cosines 
means that there is no limit to the numerical values of the group elements. 
The group is said to be non-compact. In matrix form, in (3 + 1) dimensional 
spacetime we may write this: 


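    L^\mu{}_\nu(B) =
      \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
      = \begin{pmatrix} \cosh\alpha & -\sinh\alpha & 0 & 0 \\ -\sinh\alpha & \cosh\alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, (9.70)


where the ‘rapidity’ α = tanh^{-1} β. This may be compared with the explicit form
of a rotation about the x^1 axis:


    L^\mu{}_\nu(R) =
      \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cos\theta & -\sin\theta \\ 0 & 0 & \sin\theta & \cos\theta \end{pmatrix}. (9.71)

Notice that the non-trivial parts of these matrices do not overlap. This leads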
to an important result, which we shall derive below, namely that rotations 
and boosts are independent transformations which can be used to parametrize 
general transformations. 

The form of these matrix representations makes it clear that the n-dimensional 
group of rotations, SO(n), is a sub-group with irreducible representations 


    L^\mu{}_\nu(R) = \begin{pmatrix} 1 & 0 \\ 0 & R^i{}_j \end{pmatrix}, (9.72)


and similarly that boosts in a single direction also form a sub-group. General 
boosts in multiple directions do not form a group, however. 
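
The sub-group property of collinear boosts can be seen in the non-trivial 1 + 1 dimensional block of eqn. (9.70): multiplying two boosts along the same axis gives another boost whose rapidity is the sum of the two. The sketch below is illustrative only, with hypothetical function names, and works in units where velocities are expressed as β = v/c.

```python
import math


def boost(alpha):
    """Non-trivial 2 x 2 block of a boost with rapidity alpha."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return [[ch, -sh], [-sh, ch]]


def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rapidity(beta):
    """alpha = arctanh(beta), valid for |beta| < 1."""
    return math.atanh(beta)


def add_velocities(beta1, beta2):
    """Combine two collinear velocities by adding their rapidities."""
    return math.tanh(rapidity(beta1) + rapidity(beta2))


if __name__ == "__main__":
    print(matmul2(boost(0.3), boost(0.5)))
    print(boost(0.8))                 # same matrix up to rounding
    print(add_velocities(0.5, 0.5))   # stays below 1
```

Adding rapidities reproduces the familiar composition law (β₁ + β₂)/(1 + β₁β₂) for collinear velocities.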

The form of a general boost can be derived as a generalization of the formulae 
in eqns. (9.68) on the basis of general covariance. We can write a general form 
based on figure 9.1 and eqns. (9.68) 


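    dx^{0'} = \gamma \left( dx^0 - \beta^i dx_i \right)
    dx^{i'} = \left( c_1\,\delta^i{}_j + c_2\,\frac{\beta^i \beta_j}{\beta^2} \right) dx^j - \gamma\,\beta^i dx^0. (9.73)


The unknown coefficients label projection operators for longitudinal and
transverse parts with respect to the n-component velocity vector β^i. By squaring the
above expressions and using the invariance of the line element


    ds^2 = -(dx^0)^2 + (dx^i)^2 = -(dx^{0'})^2 + (dx^{i'})^2, (9.74)

giving

    -(dx^{0'})^2 = -\gamma^2 \left( (dx^0)^2 - 2(\beta^i dx_i)\,dx^0 + (\beta^i dx_i)^2 \right), (9.75)


and


    (dx^{i'})^2 = \left( c_1^2\,\delta_{jk} + (2 c_1 c_2 + c_2^2)\,\frac{\beta_j \beta_k}{\beta^2} \right) dx^j dx^k
                  + \gamma^2 \beta^2 (dx^0)^2 - 2\gamma\,(c_1 + c_2)(\beta^i dx_i)\,dx^0, (9.76)


one compares the coefficients of similar terms with the untransformed ds² to
obtain


    c_1 = 1
    c_2 = \gamma - 1. (9.77)


Thus, in 1 + n block form, a general boost may be written as


    L^\mu{}_\nu(B) = \begin{pmatrix} \gamma & -\gamma\beta^i \\ -\gamma\beta_i & \delta_{ij} + (\gamma - 1)\frac{\beta_i \beta_j}{\beta^2} \end{pmatrix}. (9.78)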
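
The block form of the general boost, with c_1 = 1 and c_2 = γ − 1, can be tested by checking that it leaves the interval −(dx^0)² + (dx^i)² unchanged for an arbitrary displacement. This is a minimal sketch; the function names `general_boost`, `apply` and `interval`, and the sample velocity and displacement, are assumptions chosen for illustration.

```python
import math


def general_boost(beta):
    """(1+n) x (1+n) boost matrix for a velocity vector beta = v/c, |beta| < 1."""
    n = len(beta)
    b2 = sum(b * b for b in beta)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    L = [[0.0] * (n + 1) for _ in range(n + 1)]
    L[0][0] = gamma
    for i in range(n):
        L[0][i + 1] = -gamma * beta[i]
        L[i + 1][0] = -gamma * beta[i]
        for j in range(n):
            delta = 1.0 if i == j else 0.0
            # longitudinal projector beta_i beta_j / beta^2 (zero for no boost)
            proj = beta[i] * beta[j] / b2 if b2 > 0.0 else 0.0
            L[i + 1][j + 1] = delta + (gamma - 1.0) * proj
    return L


def apply(L, dx):
    """Transform a displacement (dx^0, dx^1, ..., dx^n)."""
    return [sum(L[m][k] * dx[k] for k in range(len(dx))) for m in range(len(L))]


def interval(dx):
    """Line element with signature (-, +, ..., +)."""
    return -dx[0] ** 2 + sum(c * c for c in dx[1:])


if __name__ == "__main__":
    dx = [1.0, 0.2, -0.7, 0.4]
    dx_prime = apply(general_boost([0.3, -0.4, 0.5]), dx)
    print(interval(dx), interval(dx_prime))  # equal up to rounding
```

For a velocity along a single axis the matrix reduces to the form of eqn. (9.70).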


9.4.3 The homogeneous Lorentz group: SO(1, n)


It is convenient to divide the formal discussion of the Lorentz group into two
parts. In the first instance, we shall set the inhomogeneous term, a_μ, to zero. A
homogeneous coordinate transformation takes the form


    x^\mu \rightarrow x'^\mu = L^\mu{}_\nu\, x^\nu, (9.79)


where L^μ_ν is a constant matrix. It does not include translations. After a
transformation of the line element, one has


    ds'^2 = g'_{\mu\nu}(x')\, dx'^\mu dx'^\nu
          = g'_{\mu\nu}(x')\, L^\mu{}_\rho L^\nu{}_\lambda\, dx^\rho dx^\lambda. (9.80)

The metric must compensate for this change by transforming like this:

    g_{\rho\lambda}(x) = L^\mu{}_\rho L^\nu{}_\lambda\, g'_{\mu\nu}(x'). (9.81)


This follows from the above transformation property. We can see this in matrix 
notation by considering the constant metric tensor η_μν = diag(−1, 1, 1, 1, ...),
which must be invariant if the scalar product is to be preserved. In a Cartesian
basis, we have


    x^\mu y_\mu = \eta_{\mu\nu}\, x^\mu y^\nu = \eta_{\mu\nu}\, (Lx)^\mu (Ly)^\nu
    x^T \eta\, y = (Lx)^T \eta\, (Ly)
                 = x^T L^T \eta\, L\, y. (9.82)


Comparing the left and right hand sides, we have the matrix form of eqn. (9.81)
in a Cartesian basis:


    \eta = L^T \eta\, L. (9.83)


The matrices L form a group called the homogeneous Lorentz group. We 
can now check the group properties of the transformation matrices L. The 
existence of an associative combination rule is automatically satisfied since matrix
multiplication has these properties (any representation in terms of matrices
automatically belongs to the general linear group GL(n, R)). Thus we must show
the existence of an inverse and thus an identity element. Acting on the left of
eqn. (9.83) with the metric


    \eta\, L^T \eta\, L = \eta^2 = I = L^{-1} L, (9.84)

where I is the identity matrix belonging to GL(n, R). Thus, the inverse of L is

    L^{-1} = \eta\, L^T \eta. (9.85)
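
Both the defining condition (9.83) and the formula (9.85) for the inverse can be checked on a concrete element built from the boost (9.70) and the rotation (9.71). The following is a sketch in 3 + 1 dimensions with illustrative function names; the particular angle and velocity are arbitrary.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def boost_x(beta):
    """Boost along the x^1 axis, as in eqn. (9.70)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_x(theta):
    """Rotation about the x^1 axis, as in eqn. (9.71)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, c, -s],
            [0.0, 0.0, s, c]]


def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def is_lorentz(L):
    """Test the defining condition eta = L^T eta L."""
    return close(matmul(matmul(transpose(L), ETA), L), ETA)


def lorentz_inverse(L):
    """Inverse computed as eta L^T eta, without any numerical inversion."""
    return matmul(matmul(ETA, transpose(L)), ETA)


if __name__ == "__main__":
    L = matmul(rotation_x(0.4), boost_x(0.6))
    print(is_lorentz(L))
    print(close(matmul(lorentz_inverse(L), L), IDENTITY))
```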

In components we have

    (L^{-1})^\mu{}_\nu = L_\nu{}^\mu. (9.86)


Since the inverse is the transpose, up to factors of the metric, we can write the
Lorentz group as SO(1, 3).

Dimension and structure of the group The symmetry in (n + 1)² components of
L^μ_ν implies that not all of the components may be chosen independently. The
fact that only half of the off-diagonal components are independent means that
there are


    d_G = \frac{(n+1)^2 - (n+1)}{2} (9.87)

independent components in n + 1 dimensions, given by the independent elements
of ω to be defined below. Another way of looking at this is that there are (n +
1)² components in the matrix L^μ_ν, but the number of constraints in eqn. (9.83)
limits this number. Eqn. (9.83) tells us that the transpose of the equation is
the same, thus the independent components of this equation are the diagonal
pieces plus half the off-diagonal pieces. This in turn means that the other half of
the off-diagonal equations represent the remaining freedom, or dimensionality
of the group. d_G is the dimension of the homogeneous Lorentz group. The
components of


    g_{\mu\nu}\, L^\mu{}_\rho L^\nu{}_\lambda = g_{\rho\lambda}

may be written out in 1 + n form, μ = (0, i), as follows:

    L^0{}_0 L^0{}_0\, g_{00} + L^i{}_0 L^j{}_0\, g_{ij} = g_{00}
    L^0{}_i L^0{}_0\, g_{00} + L^k{}_i L^l{}_0\, g_{kl} = g_{i0} = 0
    L^0{}_i L^0{}_j\, g_{00} + L^k{}_i L^l{}_j\, g_{kl} = g_{ij}. (9.88)
This leads to the extraction of the following equations: 


    (L^0{}_0)^2 = 1 + L^i{}_0 L_{i0}
    L^0{}_0 L^0{}_i = L^j{}_0 L_{ji}
    L^0{}_i L_{0j} + L^k{}_i L_{kj} = \delta_{ij}. (9.89)


These may also be presented in a schematic form in terms of a scalar S, a vector
V and an n × n matrix M:


    L^\mu{}_\nu = \begin{pmatrix} S & V^i \\ V_i & M_{ij} \end{pmatrix}, (9.90)

giving

    S^2 = 1 + V^i V_i
    S V^T = V^T M
    I = M^T M - V V^T. (9.91)

It is clear from eqn. (9.90) how the n-dimensional group of rotations, SO(n),
is a sub-group of the homogeneous Lorentz group acting on only the spatial
components of spacetime vectors:


    L^\mu{}_\nu(R) = \begin{pmatrix} 1 & 0 \\ 0 & R^i{}_j \end{pmatrix}. (9.92)


Notice that it is sufficient to know that L^0_0 = 1 to be able to say that a Lorentz
transformation is a rotation, since the remaining equations then imply that


    M^T M = R^T R = I, (9.93)


i.e. that the n-dimensional sub-matrix is orthogonal. The discussion of the
Lorentz group can, to a large extent, be simplified by breaking it down into
the product of a continuous, connected sub-group together with a few discrete
transformations. The elements of the group for which det L = +1 form a
sub-group which is known as the proper Lorentz group. From
the first line of eqn. (9.89) or (9.91), we have that L^0_0 ≥ 1 or L^0_0 ≤ −1.
The group elements with L^0_0 ≥ 1 and det L = +1 form a sub-group called
the proper orthochronous Lorentz group, or the restricted Lorentz group. This
group is continuously connected, but, since there is no continuous change of 
any parameter that will deform an object with det L = +1 into an object with 
det L = —1 (since this would involve passing through det L = 0), this sub-group 
is not connected to group elements with negative determinants. We can map 
these disconnected sub-groups into one another, however, with the help of the 
discrete or large Lorentz transformations of parity (space reflection) and time 
reversal. 
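
The four disconnected pieces can be told apart by two signs, that of det L and that of L^0_0. The sketch below illustrates this classification for the identity and the discrete transformations; the function names and the diagonal matrices used for parity and time reversal are assumptions made for the example.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def classify(L):
    """Label a Lorentz matrix by the sign of det L and the sign of L^0_0."""
    proper = "proper" if det(L) > 0 else "improper"
    ortho = "orthochronous" if L[0][0] > 0 else "non-orthochronous"
    return proper, ortho

def diag(*entries):
    """Diagonal matrix as a list of rows."""
    n = len(entries)
    return [[float(entries[i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

IDENTITY = diag(1, 1, 1, 1)
PARITY = diag(1, -1, -1, -1)          # space reflection
TIME_REVERSAL = diag(-1, 1, 1, 1)     # time reversal
PT = diag(-1, -1, -1, -1)             # both reflections combined

labels = {name: classify(m) for name, m in
          [("identity", IDENTITY), ("parity", PARITY),
           ("time reversal", TIME_REVERSAL), ("PT", PT)]}
```

Only the identity lies in the restricted group; multiplying by parity, time reversal or their product moves an element into one of the other three components.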


Group parametrization and generators The connected part of the homogeneous
Lorentz group may be investigated most easily by considering an infinitesimal
transformation in a representation which acts directly on spacetime tensors, i.e.
a transformation which lies very close to the identity and whose representation
indices A, B are spacetime indices $\mu, \nu$. This is the form which is usually
required, and the only form we have discussed so far, but it is not the only
representation of the group, as the discussion in the previous chapter should
convince us. We can write such an infinitesimal transformation, $L(\epsilon)$, in terms
of a symmetric part and an anti-symmetric part, without loss of generality:

\[ L(\epsilon) = I + \epsilon\,(\omega + \overline{\omega}), \tag{9.94} \]

where $\omega$ is an anti-symmetric matrix, and I and $\overline{\omega}$ together form the symmetric
part. $\epsilon$ is a vanishingly small (infinitesimal) number. Thus we write, with
indices,

\[ L^{\mu}{}_{\rho}(\epsilon) = \delta^{\mu}{}_{\rho} + \epsilon\,(\omega^{\mu}{}_{\rho} + \overline{\omega}^{\mu}{}_{\rho}). \tag{9.95} \]

Note that, for general utility, the notation commonly appearing in the literature
is used here, but beware that the notation is used somewhat confusingly. Some
words of explanation are provided below. Substituting this form into eqn. (9.81)
gives, to first order in $\epsilon$,

\[ g_{\mu\nu} L^{\mu}{}_{\rho} L^{\nu}{}_{\lambda} = g_{\rho\lambda} + \epsilon\,(\omega_{\rho\lambda} + \omega_{\lambda\rho} + \overline{\omega}_{\rho\lambda} + \overline{\omega}_{\lambda\rho}) + \cdots + O(\epsilon^{2}). \tag{9.96} \]

Comparing the left and right hand sides of this equation, we find that

\[
\begin{aligned}
\omega_{\mu\nu} &= -\omega_{\nu\mu} \\
\overline{\omega}_{\mu\nu} &= -\overline{\omega}_{\nu\mu} = 0.
\end{aligned}
\tag{9.97}
\]

Thus, the off-diagonal terms in $L(\epsilon)$ are anti-symmetric. This property survives
exponentiation and persists in finite group elements with one subtlety, which is
associated with the indefinite metric. We may therefore identify the structure of
a finite Lorentz transformation, L, in spacetime block form. Note that a Lorentz
transformation has one index up and one down, since it must map vectors to
vectors of the same type:

\[ L^{\mu}{}_{\nu} = \begin{pmatrix} L^{0}{}_{0} & L^{0}{}_{j} \\ L^{i}{}_{0} & L^{i}{}_{j} \end{pmatrix}. \tag{9.98} \]

There are two independent (reducible) parts to this matrix representing boosts
($\mu, \nu = 0, i$) and rotations ($\mu, \nu = i, j$). Although the generator $\omega_{\mu\nu}$ is purely
anti-symmetric, the 0, i components form a symmetric matrix under transpose
since the act of transposition involves use of the metric:

\[ (L^{0}{}_{i})^{T} = -L_{i}{}^{0} = L^{0}{}_{i}. \tag{9.99} \]

The second, with purely spatial components, is anti-symmetric since the
generator is anti-symmetric, and the metric leaves the signs of spatial indices
unchanged:

\[ (L^{i}{}_{j})^{T} = -L^{i}{}_{j}. \tag{9.100} \]
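
To make the sign bookkeeping concrete, here is a small worked example. It assumes the metric signature $g = \mathrm{diag}(-1, 1, 1, 1)$, and the parameter names $\alpha$ and $\theta$ are introduced only for illustration. Take an anti-symmetric generator with a single boost component and a single rotation component:

```latex
\begin{aligned}
\omega_{01} &= -\omega_{10} = \alpha, &\qquad
\omega^{0}{}_{1} &= g^{00}\omega_{01} = -\alpha, &\qquad
\omega^{1}{}_{0} &= g^{11}\omega_{10} = -\alpha, \\
\omega_{12} &= -\omega_{21} = \theta, &\qquad
\omega^{1}{}_{2} &= g^{11}\omega_{12} = \theta, &\qquad
\omega^{2}{}_{1} &= g^{22}\omega_{21} = -\theta .
\end{aligned}
```

With one index raised, the boost block is symmetric while the rotation block remains anti-symmetric; with both indices down, every component is anti-symmetric.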
Thus, the summary of these two may be written (with both indices down)

\[ L_{\mu\nu} = -L_{\nu\mu}. \tag{9.101} \]

The matrix generators in a (3 + 1) dimensional representation for the Lorentz
group in (3 + 1) spacetime dimensions, $T^{\mu\nu}_{3+1}$, are given explicitly by

\[
\begin{aligned}
T^{01}_{3+1} &= \begin{pmatrix} 0 & i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, &
T^{02}_{3+1} &= \begin{pmatrix} 0 & 0 & i & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, &
T^{03}_{3+1} &= \begin{pmatrix} 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \\
T^{12}_{3+1} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, &
T^{23}_{3+1} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, &
T^{31}_{3+1} &= \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix}.
\end{aligned}
\tag{9.102}
\]

Note that, because of the indefinite metric, only the spatial components of these
generators are Hermitian. This will lead us to reparametrize the components in
terms of positive definite group indices below. It is now conventional, if a
little confusing, to write a general infinitesimal Lorentz transformation in the
form

\[ U_R = L_R(\omega) = I_R + \frac{i}{2}\,\omega_{\mu\nu} T_R^{\mu\nu}, \tag{9.103} \]

where $I_R$ and $T_R$ are the identity and generator matrices of a given representation
$G_R$. In terms of their components A, B,

\[ U^{A}{}_{B} = \delta^{A}{}_{B} + \frac{i}{2}\,\omega_{\rho\sigma} [T_R^{\rho\sigma}]^{A}{}_{B}. \tag{9.104} \]

The second term here corresponds to the second term in eqn. (9.95), but the
spacetime-specific indices $\mu$ in eqn. (9.95) have now been replaced by
representation indices A, B, anticipating a generalization to other representations.
A general finite element of the group in a representation $G_R$ is obtained by
exponentiation,

\[ L_R = \exp\left(\frac{i}{2}\,\omega_{\rho\sigma} T_R^{\rho\sigma}\right). \tag{9.105} \]

Let us take a moment to understand this form, since it appears repeatedly in the
literature without satisfactory explanation. The $\omega_{\mu\nu}$ which appears here is not
the same as $\epsilon\,\omega_{\mu\nu}$ a priori (but see the next point). In fact, it plays the role of
the group parameters $\theta^{a}$ in the previous chapter. Thus, in the language of the
previous chapter, one would write

\[
\begin{aligned}
U^{A}{}_{B} = L^{A}{}_{B}(\theta) &= \delta^{A}{}_{B} + i\,\theta^{a} [T_R^{a}]^{A}{}_{B} \\
L_R &= \exp\left(i\,\theta^{a} T_R^{a}\right).
\end{aligned}
\tag{9.106}
\]

It is easy to see that the use of two indices is redundant notation, since most
of the elements of the generators are zeros. It is simply a convenient way to
count the number of non-zero group dimensions $d_G$ in terms of spacetime
indices $\mu, \nu = 0, \ldots, n$ rather than positive definite $a, b = 1, \ldots, d_G$ indices
of the group space. The factor of 1/2 in eqn. (9.105) accounts for the double
counting due to the anti-symmetry in the summation over all $\mu, \nu$ indices. The
fact that two indices are used in this summation, rather than the usual one index
in $T^{a}$, should not lead to confusion. To make contact with the usual notation for
generators, we may take the (3 + 1) dimensional case as an example. In (3 + 1)
dimensions, the homogeneous Lorentz group has $d_G = 6$, and its complement
of generators may be written:

\[ T^{a} = \left\{ T^{10}_{3+1}, T^{20}_{3+1}, T^{30}_{3+1}, T^{12}_{3+1}, T^{23}_{3+1}, T^{31}_{3+1} \right\}, \tag{9.107} \]

where $a = 1, \ldots, 6$ and the group elements in eqn. (9.105) have the form

\[ \exp\left(i\,\theta^{a} T^{a}\right). \tag{9.108} \]
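
As an illustration of the exponential form, the sketch below exponentiates a single boost generator with a truncated power series and compares the result with the familiar hyperbolic boost matrix. It assumes the matrix $T^{01}$ as read from eqn. (9.102), the anti-symmetry $T^{10} = -T^{01}$, and a parameter $\omega_{01} = -\omega_{10} = \alpha$; the helper `mat_exp` and the value of `alpha` are illustrative choices only.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=30):
    """Matrix exponential by a truncated Taylor series (fine for small matrices)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds a^k / k!
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Boost generator in the defining representation, as read from eqn. (9.102).
T01 = [[0, 1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

alpha = 0.5  # rapidity-like parameter, chosen arbitrarily
# (i/2)(w_01 T^01 + w_10 T^10) = i * alpha * T^01 when w_01 = -w_10 = alpha.
exponent = [[1j * alpha * x for x in row] for row in T01]
L = mat_exp(exponent)
```

Under these assumptions the time-space block of `L` has the form (cosh α, -sinh α; -sinh α, cosh α), i.e. a boost with γ = cosh α and γβ = sinh α.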


The first three $T^{a}$ are the generators of boosts (spacetime rotations), while the
latter three are the generators of spatial rotations. The anti-symmetric matrix of
parameters $\omega_{\mu\nu}$ contains the components of the rapidity $\alpha^{i}$ from eqn. (9.68) as
well as the angles $\theta^{i}$ which characterize rotations. Eqn. (9.105) is general for
any representation of the Lorentz group in n + 1 dimensions with an appropriate
set of matrix generators $T_{\mu\nu}$.

Lie algebra in 3 + 1 dimensions The generators above satisfy a Lie algebra
relation which can be written in several equivalent forms. In terms of the
two-index parametrization, one has

\[ [T_R^{\mu\nu}, T_R^{\rho\sigma}] = i\left(\eta^{\nu\sigma} T_R^{\mu\rho} + \eta^{\mu\rho} T_R^{\nu\sigma} - \eta^{\mu\sigma} T_R^{\nu\rho} - \eta^{\nu\rho} T_R^{\mu\sigma}\right). \tag{9.109} \]

This result applies in any number of dimensions. To see this, it is necessary to
tie up a loose end from the discussion of the parameters $\omega_{\mu\nu}$ and $\epsilon\,\omega_{\mu\nu}$ above.
While these two quantities play formally different roles, in the way they are
introduced above they are in fact equivalent to one another and can even be
defined to be equal. This is not in contradiction with what is stated above, where
pains were taken to distinguish these two quantities formally. The resolution of
this point comes about by distinguishing carefully between which properties
are special for a specific representation and which properties are general for all
representations. Let us try to unravel this point.

The Lorentz transformation is defined in physics by the effect it has on
spacetime reference frames (see figure 9.1). If we take this as a starting
point, then we must begin by dealing with a representation in which the
transformations act on spacetime vectors and tensors. This is the representation
in which $A, B \to \mu, \nu$, and we can write an infinitesimal transformation as in
eqn. (9.95). The alternative form in eqn. (9.104) applies for any representation.
If we compare the two infinitesimal forms, it seems clear that $\omega_{\mu\nu}$ plays the
role of a generator $T_{AB}$, and in fact we can make this identification complete by
defining

\[ \epsilon\,\omega = \frac{i}{2}\,\epsilon\,\omega_{\rho\sigma} T^{\rho\sigma}. \tag{9.110} \]

This is made clearer if we make the identification again, showing clearly the
representation specific indices:

\[ \epsilon\,\omega^{\mu}{}_{\nu} = \frac{i}{2}\left[\epsilon\,\omega_{\rho\sigma} T_{3+1}^{\rho\sigma}\right]^{\mu}{}_{\nu}. \tag{9.111} \]

This equation is easily satisfied by choosing

\[ [T_{3+1}^{\rho\sigma}]^{\mu}{}_{\nu} \sim \eta^{\rho\mu}\eta^{\sigma}{}_{\nu}. \tag{9.112} \]

However, we must be careful about preserving the anti-symmetry of $T_{3+1}$, so we
have

\[ [T_{3+1}^{\rho\sigma}]^{A}{}_{B} = \frac{2}{i} \times \frac{1}{2}\left(\eta^{\rho A}\eta^{\sigma}{}_{B} - \eta^{\sigma A}\eta^{\rho}{}_{B}\right). \tag{9.113} \]

Clearly, this equation can only be true when A, B representation indices belong
to the set of (3 + 1) spacetime indices, so this equation is only true in one
representation. Nevertheless, we can use this representation-specific result to
determine the algebra relation which is independent of representation as follows.
By writing

\[
\begin{aligned}
[T_{3+1}^{\mu\nu}]^{A}{}_{B} &= -i\left(\eta^{\mu A}\eta^{\nu}{}_{B} - \eta^{\nu A}\eta^{\mu}{}_{B}\right) \\
[T_{3+1}^{\rho\sigma}]^{B}{}_{C} &= -i\left(\eta^{\rho B}\eta^{\sigma}{}_{C} - \eta^{\sigma B}\eta^{\rho}{}_{C}\right),
\end{aligned}
\tag{9.114}
\]

it is straightforward to compute the commutator,

\[ [T_{3+1}^{\mu\nu}, T_{3+1}^{\rho\sigma}]^{A}{}_{C}, \tag{9.115} \]

in terms of $\eta$ tensors. Each contraction over B leaves a new $\eta$ with only
spacetime indices. The remaining $\eta$'s have mixed A, $\mu$ indices and occur in
pairs, which can be identified as generators by reversing eqn. (9.113). The result
with A, C indices suppressed is given by eqn. (9.109). In fact, the expression is
uniform in indices A, C and thus these 'cancel' out of the result; more correctly
they may be generalized to any representation.
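
The computation described above can also be done by brute force. The following sketch builds the defining-representation generators from the component formula and checks the commutation relation for all index combinations. It assumes the normalization $[T^{\rho\sigma}]^{A}{}_{B} = -i(\eta^{\rho A}\delta^{\sigma}{}_{B} - \eta^{\sigma A}\delta^{\rho}{}_{B})$, the signature diag(-1, 1, 1, 1), and the index ordering on the right-hand side written in the code; flipping the overall sign of the generators would flip the sign of the right-hand side. All function names are illustrative.

```python
# eta^{mu nu}; numerically identical to eta_{mu nu} for a diagonal metric with entries +-1.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def generator(rho, sigma):
    """[T^{rho sigma}]^A_B = -i (eta^{rho A} delta^sigma_B - eta^{sigma A} delta^rho_B)."""
    return [[-1j * (ETA[rho][a] * (sigma == b) - ETA[sigma][a] * (rho == b))
             for b in range(4)] for a in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def algebra_rhs(mu, nu, rho, sigma):
    """i (eta^{nu sigma} T^{mu rho} + eta^{mu rho} T^{nu sigma}
          - eta^{mu sigma} T^{nu rho} - eta^{nu rho} T^{mu sigma})."""
    terms = [(ETA[nu][sigma], generator(mu, rho)),
             (ETA[mu][rho], generator(nu, sigma)),
             (-ETA[mu][sigma], generator(nu, rho)),
             (-ETA[nu][rho], generator(mu, sigma))]
    return [[1j * sum(c * t[i][j] for c, t in terms) for j in range(4)]
            for i in range(4)]

def max_violation():
    """Largest entry of [T, T] - RHS over all 4^4 index combinations."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            for rho in range(4):
                for sigma in range(4):
                    lhs = commutator(generator(mu, nu), generator(rho, sigma))
                    rhs = algebra_rhs(mu, nu, rho, sigma)
                    worst = max(worst, max(abs(lhs[i][j] - rhs[i][j])
                                           for i in range(4) for j in range(4)))
    return worst
```

A return value of zero (to rounding) from `max_violation()` confirms the relation in this one representation; the argument in the text is what extends it to all representations.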

The representations of the restricted homogeneous Lorentz group are the
solutions to eqn. (9.109). The finite-dimensional, irreducible representations can
be labelled by two discrete indices which can take values in the positive integers,
positive half-integers and zero. This may be seen by writing the generators in
a vector form, analogous to the electric and magnetic components of the field
strength $F^{\mu\nu}$ in (3 + 1) dimensions:

\[
\begin{aligned}
J_i \equiv T_{Bi} &= \frac{1}{2}\,\epsilon_{ijk} T^{jk} = (T^{23}, T^{31}, T^{12}) \\
K_i \equiv T_{Ei}/c &= T_{0i} = (T^{10}, T^{20}, T^{30}).
\end{aligned}
\tag{9.116}
\]

These satisfy the Lie algebra commutation rules

\[
\begin{aligned}
[T_{Bi}, T_{Bj}] &= i\,\epsilon_{ijk} T_{Bk} \\
[T_{Ei}, T_{Ej}]/c^{2} &= -i\,\epsilon_{ijk} T_{Bk} \\
[T_{Bi}, T_{Ej}] &= i\,\epsilon_{ijk} T_{Ek}.
\end{aligned}
\tag{9.117}
\]

Also, as with electromagnetism, one can construct the invariants

\[
\begin{aligned}
\frac{1}{2}\, T^{\mu\nu} T_{\mu\nu} &= T_B^{2} - T_E^{2}/c^{2} \\
\frac{1}{8}\,\epsilon_{\mu\nu\rho\sigma} T^{\mu\nu} T^{\rho\sigma} &= -T_{Bi} T_{Ei}/c.
\end{aligned}
\tag{9.118}
\]

These quantities are Casimir invariants. They are proportional to the identity
element in any representation, and thus their values can be used to label the
representations. From this form of the generators we obtain an interesting
perspective on electromagnetism: its form is an inevitable expression of the
properties of the Lorentz group for vector fields. In other words, the constraints
of relativity balanced with the freedom in a vector field determine the form of
the action in terms of representations of the restricted group.

The structure of the group can be further unravelled and related to earlier
discussions of the Cartan-Weyl basis by forming the new Hermitian operators

\[
\begin{aligned}
E_i &= \frac{1}{2}\,\chi_h \left(T_{Bi} + i\,T_{Ei}/c\right) \\
F_i &= \frac{1}{2}\,\chi_h \left(T_{Bi} - i\,T_{Ei}/c\right),
\end{aligned}
\tag{9.119}
\]

which satisfy the commutation rules

\[
\begin{aligned}
[E_i, E_j] &= i\,\chi_h\,\epsilon_{ijk} E_k \\
[F_i, F_j] &= i\,\chi_h\,\epsilon_{ijk} F_k \\
[E_i, F_j] &= 0.
\end{aligned}
\tag{9.120}
\]

The scale factor, $\chi_h$, is included for generality. It is conventional to discuss
angular momentum directly in quantum mechanics texts, for which $\chi_h \to \hbar$.
For pure rotation, we can take $\chi_h = 1$. As a matter of principle, we choose
to write $\chi_h$ rather than $\hbar$, since there is no reason to choose a special value
for this scale on the basis of group theory alone. The special value $\chi_h = \hbar$ is
the value which is measured for quantum mechanical systems. The restricted
Lorentz group algebra now has the form of two copies of the rotation algebra
su(2) in three spatial dimensions, and the highest weights of the representations
of these algebras will be the two labels which characterize the full representation
of the Lorentz group.
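
The factorization into two commuting su(2) algebras can be verified numerically in the defining representation. The sketch below is only an illustration: it assumes $\chi_h = 1$ and $c = 1$, the generator normalization $[T^{\rho\sigma}]^{A}{}_{B} = -i(\eta^{\rho A}\delta^{\sigma}{}_{B} - \eta^{\sigma A}\delta^{\rho}{}_{B})$, and the identifications $J_i = (T^{23}, T^{31}, T^{12})$ and $K_i = (T^{10}, T^{20}, T^{30})$. Reversing the sign of $K_i$ merely exchanges the roles of E and F.

```python
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def generator(rho, sigma):
    """[T^{rho sigma}]^A_B = -i (eta^{rho A} delta^sigma_B - eta^{sigma A} delta^rho_B)."""
    return [[-1j * (ETA[rho][a] * (sigma == b) - ETA[sigma][a] * (rho == b))
             for b in range(4)] for a in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def half_sum(a, coeff, b):
    """Return (a + coeff * b) / 2, the combination used for E_i and F_i."""
    return [[0.5 * (a[i][j] + coeff * b[i][j]) for j in range(4)] for i in range(4)]

def epsilon(i, j, k):
    """Levi-Civita symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

J = [generator(2, 3), generator(3, 1), generator(1, 2)]   # rotations
K = [generator(1, 0), generator(2, 0), generator(3, 0)]   # boosts
E = [half_sum(J[i], 1j, K[i]) for i in range(3)]
F = [half_sum(J[i], -1j, K[i]) for i in range(3)]

def su2_defect(X):
    """Largest entry of [X_i, X_j] - i eps_ijk X_k."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            lhs = commutator(X[i], X[j])
            for r in range(4):
                for c in range(4):
                    rhs = sum(1j * epsilon(i, j, k) * X[k][r][c] for k in range(3))
                    worst = max(worst, abs(lhs[r][c] - rhs))
    return worst

def mixing_defect():
    """Largest entry of any [E_i, F_j]; zero if the two algebras commute."""
    return max(abs(x) for i in range(3) for j in range(3)
               for row in commutator(E[i], F[j]) for x in row)

def casimir(X):
    """Sum over i of X_i X_i."""
    squares = [matmul(x, x) for x in X]
    return [[sum(s[r][c] for s in squares) for c in range(4)] for r in range(4)]
```

In this representation `casimir(E)` and `casimir(F)` both come out as 3/4 times the identity, i.e. e(e + 1) with e = 1/2, which is the highest weight of the 4-vector representation.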

From the commutation rules (and referring to section 8.5.10), we see that the
algebra space may be spanned by a set of basis vectors ($(2\Lambda_{\max} + 1)(2\Lambda'_{\max} + 1)$
of them). It is usual to use the notation

\[
\begin{aligned}
\Lambda_c &= \chi_h\,(m_e, m_f) \\
\Lambda_{\max} &= \chi_h\,(e, f)
\end{aligned}
\tag{9.121}
\]

in physics texts, where they are referred to as quantum numbers rather than
algebra eigenvalues. Also, the labels $j_1, j_2$ are often used for e, f, but, in the
interest of a consistent and unique notation, it is best not to confuse these with
the eigenvalues of the total angular momentum $J_i$, which is slightly different.
In terms of these labels, the Lorentz group basis vectors may be written as
$|e, m_e; f, m_f\rangle$, where $-e \le m_e \le e$, $-f \le m_f \le f$, and $e, m_e, f, m_f$ take
on integer or half-integer values. The Cartan-Weyl stepping operators are then,
by direct transcription from section 8.5.10,

\[
\begin{aligned}
E_{\pm}\,|e, m_e; f, m_f\rangle &= (E_1 \pm iE_2)\,|e, m_e; f, m_f\rangle \\
&= \chi_h \sqrt{(e \mp m_e)(e \pm m_e + 1)}\;|e, m_e \pm 1; f, m_f\rangle \\
E_3\,|e, m_e; f, m_f\rangle &= \chi_h\, m_e\,|e, m_e; f, m_f\rangle
\end{aligned}
\tag{9.122}
\]

and

\[
\begin{aligned}
F_{\pm}\,|e, m_e; f, m_f\rangle &= (F_1 \pm iF_2)\,|e, m_e; f, m_f\rangle \\
&= \chi_h \sqrt{(f \mp m_f)(f \pm m_f + 1)}\;|e, m_e; f, m_f \pm 1\rangle \\
F_3\,|e, m_e; f, m_f\rangle &= \chi_h\, m_f\,|e, m_e; f, m_f\rangle.
\end{aligned}
\tag{9.123}
\]

The algebra has factorized into two su(2) sub-algebras. Each irreducible
representation of this algebra may be labelled by a pair (e, f), which corresponds to
boosts and rotations, from the factorization of the algebra into E and F parts.
The number of independent components in such an irreducible representation
is (2e + 1)(2f + 1) since, for every e, f can run over all of its values, and
vice versa. The physical significance of these numbers lies in the extent to
which they may be used to construct field theories which describe real physical
situations. Let us round off the discussion of representations by indicating how
these irreducible labels apply to physical fields.


9.4.4 Different representations of the Lorentz group in 3 + 1 dimensions 


The explicit form of the Lorentz group generators given in eqns. (9.102) is
called the defining representation. It is also the form which applies to the
transformation of a spacetime vector. Using this explicit form, we can compute
the Casimir invariants for $E_i$ and $F_i$ to determine the values of e and f which
characterize that representation. It is a straightforward exercise to perform the
matrix multiplication and show that

\[ E^{2} = E^{i}E_{i} = \frac{1}{4}\,\chi_h^{2}\left(T_B^{2} - T_E^{2}/c^{2}\right) = \frac{3}{4}\,\chi_h^{2}\, I_{3+1}, \tag{9.124} \]

where $I_{3+1}$ is the identity matrix for the defining representation. Now, this
value can be likened to the general form to determine the highest weight of
the representation e:

\[ E^{2} = \frac{3}{4}\,\chi_h^{2}\, I_{3+1} = e(e + 1)\,\chi_h^{2}\, I_{3+1}, \tag{9.125} \]

whence we deduce that $e = \frac{1}{2}$. The same argument may be applied to $F^{2}$, with
the same result. Thus, the defining representation is characterized by the pair of
numbers $(e, f) = (\frac{1}{2}, \frac{1}{2})$.

The Lorentz transformations have been discussed so far in terms of tensors,
but the independent components of a tensor are not always in an obvious form.
A vector, for instance, transforms as

\[ A^{\mu} \to L^{\mu}{}_{\nu} A^{\nu}, \tag{9.126} \]

but a rank 2-tensor transforms with two such Lorentz transformation matrices

\[ A^{\mu\nu} \to L^{\mu}{}_{\rho} L^{\nu}{}_{\sigma} A^{\rho\sigma}. \tag{9.127} \]

The independent components of a rank 2-tensor might be either diagonal or
off-diagonal, and there might be redundant zeros or terms which are identical
by symmetry or anti-symmetry, but one could think of re-writing eqn. (9.127)
in terms of a single larger matrix acting on a new vector where only the
independent components were present, rather than two smaller matrices acting
on a tensor. Again, this has to do with a choice of representations. We just pick
out the components and re-write the transformations in a way which preserves
their content, but changes their form.

Suppose then we do this: we collect all of the independent components of any
tensor field into a column vector,

\[ A^{\mu\nu\ldots\lambda} \to \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{pmatrix}, \tag{9.128} \]

where N is the total number of independent components in the object being
acted upon, and is therefore the dimension of this representation. The array of
matrices $L^{\mu}{}_{\nu}$ (one for each index) can now be replaced by a single matrix $L_{\oplus}$
which will have as many independent components as the product of the L's.
Often such a single matrix will be reducible into block-diagonal form, i.e. a
direct sum of irreducible representations.

The irreducible blocks of any (3 + 1) spacetime-dimensional Lorentz
transformation of arbitrary representation $d_R$ are denoted $D^{(e,f)}(G_R)$. A tensor
transformation of rank N might therefore decompose into a number of irreducible
blocks in equivalent-vector form:

\[ L_{\oplus} = D^{(e_1,f_1)} \oplus D^{(e_2,f_2)} \oplus \cdots \oplus D^{(e_N,f_N)}. \tag{9.129} \]

The decomposition of a product of transformations as a series of irreducible
representations

\[ D^{(A)} \otimes D^{(B)} = \sum_{\oplus} c_M\, D^{(M)} \tag{9.130} \]

is called the Clebsch-Gordan series. The indices A, B run over
1, ..., (2e + 1)(2f + 1) for each irreducible block. For each value of e, we may
take all the values of f in turn, and vice versa.

Table 9.1. Spin/helicity properties of some representations of the Lorentz group in
(3 + 1) dimensions. The number of degrees of freedom (D.F.) is
$\phi = (2e + 1)(2f + 1)$. Note that the electromagnetic field $F_{\mu\nu}$ lacks the
longitudinal mode $m_s = 0$ of the massive vector field $A_\mu$.

Representation (e, f)              'Spin' m_s = e + f   D.F. φ   Description
(1/2, 0)                           1/2                  2        Weyl 2-spinor
(0, 1/2)                           1/2                  2        Weyl 2-spinor
(0, 0)                             0                    1        trivial scalar
(1/2, 0) ⊕ (0, 1/2)                ±1/2                 4        Dirac 4-spinor
(1/2, 1/2)                         0, 1                 4        4-vector A_μ
(1, 0) ⊕ (0, 1)                    ±1                   6        anti-symm. F_μν
(1, 1) ⊕ (1, 0) ⊕ (0, 1) ⊕ (0, 0)  0, 1, 2              16       rank 2-tensor

So which representation applies to which field? We can look at this in two ways.


• We see that e, f are allowed by the general solution of the Lorentz
symmetry. The values are 0, 1/2, 1, .... We then simply construct fields
which transform according to these representations and match them with
physical phenomena.


• We look at fields which we know about ($\phi$, $A_\mu$, $g_{\mu\nu}$, ...) and determine
what e, f these correspond to.

Some common values of ‘spin’ are listed in table 9.1. Counting the highest 
weights of the blocks is not difficult, but to understand the difference between a 
massless vector field and a massive vector field, for example (both with highest 
spin weight 1), we must appreciate that these fields have quite different space- 
time transformation properties. This is explained by the fact that there are two 
ways in which a spin 1 field can be constructed from irreducible representations 
of the Lorentz group, and they form inequivalent representations. Since we are 
dealing with the homogeneous Lorentz group in a given frame, the spin is the 
same as the total intrinsic angular momentum of the frame, and is defined by a 
sum of the two vectors 


\[ S_i \equiv J_i = E_i + F_i, \tag{9.131} \]

with maximum helicity s given by e + f; the range of allowed values follows
in integer steps from the rules of vector addition (see section 11.7.4). The
maximum value is when the vectors are parallel and the minimum value is when
they are anti-parallel. Thus

\[ s = \pm(e + f),\ \pm(e + f - 1),\ \ldots,\ \pm|e - f|. \tag{9.132} \]

The spin s is just the highest weight of the Lorentz representation. Of all
the representations which one might construct for physical models, we can
narrow down the possibilities by considering further symmetry properties. Most
physical fields do not change their properties under parity transformations
or spatial reflection. Under a spatial reflection, the generators $E_i$, $F_i$ are
exchanged:

\[
\begin{aligned}
P E_i P^{-1} &= F_i \\
P F_i P^{-1} &= E_i.
\end{aligned}
\tag{9.133}
\]

In order to be consistent with spatial reflections, the representations of
parity-invariant fields must be symmetrical in (e, f). This means we can either make
irreducible representations of the form

\[ (e, e) \tag{9.134} \]

or symmetrized composite representations of the form

\[ (e, f) \oplus (f, e), \tag{9.135} \]

such that exchanging $e \leftrightarrow f$ leaves them invariant.
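
The counting rules above are easy to tabulate. The following sketch reproduces the degrees of freedom, the range of spin magnitudes and the parity-symmetry test for a few of the representations in table 9.1. It uses exact fractions for half-integer labels; the function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def degrees_of_freedom(e, f):
    """Number of components (2e + 1)(2f + 1) of the irreducible block (e, f)."""
    return int((2 * e + 1) * (2 * f + 1))

def spin_values(e, f):
    """Spin magnitudes |e - f|, ..., e + f in integer steps, as in eqn. (9.132)."""
    e, f = Fraction(e), Fraction(f)
    low, high = abs(e - f), e + f
    return [low + k for k in range(int(high - low) + 1)]

def is_parity_symmetric(blocks):
    """True if the collection of (e, f) blocks is unchanged under e <-> f."""
    original = sorted((Fraction(e), Fraction(f)) for e, f in blocks)
    swapped = sorted((Fraction(f), Fraction(e)) for e, f in blocks)
    return original == swapped

vector = [(HALF, HALF)]                       # 4-vector
field_strength = [(1, 0), (0, 1)]             # anti-symmetric rank 2-tensor
weyl = [(HALF, 0)]                            # single Weyl spinor
rank2 = [(1, 1), (1, 0), (0, 1), (0, 0)]      # general rank 2-tensor

total_rank2 = sum(degrees_of_freedom(e, f) for e, f in rank2)
```

A single Weyl block fails the parity test, whereas the Dirac combination (1/2, 0) ⊕ (0, 1/2) and the vector (1/2, 1/2) pass it.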


Helicity values for spin 1 For example, a spin 1 field can be made in two ways
which correspond to the massless and massive representations of the Poincaré
algebra. In the first case, a spin 1 field can be constructed with the irreducible
transformational properties of a vector field,

\[ \left(\frac{1}{2}, \frac{1}{2}\right). \tag{9.136} \]

A field of this type would exist in nature with spin/helicities s = 0, ±1. These
correspond to: (i) 2s + 1 = 1, i.e. one longitudinal scalar component $A_0$, and (ii)
2s + 1 = 3, a left or right circularly polarized vector field. This characterizes the
massive Proca field, $A_\mu$, which describes W and Z vector bosons in the
electroweak theory. However, it is also possible to construct a field which transforms
as

\[ (1, 0) \oplus (0, 1). \tag{9.137} \]

The weight strings from this representation have only the values $m_s = \pm 1$, the
left and right circular polarizations. There is no longitudinal zero component.
The values here apply to the photon field, $F_{\mu\nu}$. The symmetrization
corresponds to the anti-symmetry of the electromagnetic field strength tensor. The
anti-symmetry is also the key to understanding the difference between these two
representations.

One reason for looking at this example is that, at first glance, it seems 
confusing. After all, the photon is also usually represented by a vector potential 
A,,, but here we are claiming that a vector formulation is quite different from 
an anti-symmetric tensor formulation. There is a crucial difference between the 
massive vector field and the massless vector field, however. The difference can 
be expressed in several equivalent ways which all knit together to illuminate the 
theme of representations nicely. 

The physical photon field, $F_{\mu\nu}$, transforms like a tensor of rank 2. Because
of its anti-symmetry, it can also be written in terms of a massless 4-vector
potential, which transforms like a gauge-invariant vector field. Thus, the
massless vector field is associated with the anti-symmetric tensor form. The
massive Proca field only transforms like a vector field with no gauge invariance.
The gauge invariance is actually a direct manifestation of the difference in
transformation properties through a larger invariance group with a deep connection
to the Lorentz group. The true equation satisfied by the photon field is

\[ \partial_{\mu} F^{\mu\nu} = \left(\Box\,\delta^{\nu}{}_{\mu} - \partial^{\nu}\partial_{\mu}\right) A^{\mu} = 0, \tag{9.138} \]

while the Proca field satisfies

\[ (-\Box + m^{2})\,A_{\mu} = 0. \tag{9.139} \]

This displays the difference between the fields. The photon field has a degree
of freedom which the Proca field does not; namely, its vector formulation is
invariant under

\[ A_{\mu} \to A_{\mu} + (\partial_{\mu} s), \tag{9.140} \]

for any scalar function s(x). The Proca field is not. Because of the gauge
symmetry, for the photon, no coordinate transformation is complete without an
associated, arbitrary gauge transformation. A general coordinate variation of
these fields illustrates this (see section 4.5.2).


    Photon field    $\delta_x A^{\mu} = \epsilon_{\nu} F^{\nu\mu}$
    Proca field     $\delta_x A^{\mu} = \epsilon_{\nu}\,(\partial^{\nu} A^{\mu})$.


The difference between these two results is a gauge term. This has the
consequence that the photon's gauge field formulation behaves like an element
of the conformal group, owing to the spacetime-dependent function s(x). This
is very clearly illustrated in section 11.5. The gauge field $A_\mu$ must transform
like this if the tensor $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ which derives from it is to transform
like an element of the Lorentz group. The same is not true of the Proca field,
$A_\mu$, which is simply a vector field without complication.
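
The role of the gauge term can be checked in a few lines. The following is a short verification using only the definition $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and the shift of $A_\mu$ by the gradient of a scalar s(x); it assumes that s(x) is twice differentiable, so that partial derivatives commute.

```latex
\begin{aligned}
F_{\mu\nu} \to F'_{\mu\nu}
  &= \partial_\mu (A_\nu + \partial_\nu s) - \partial_\nu (A_\mu + \partial_\mu s) \\
  &= F_{\mu\nu} + (\partial_\mu\partial_\nu - \partial_\nu\partial_\mu)\, s = F_{\mu\nu}, \\
m^2 A_\mu \to m^2 A'_\mu &= m^2 A_\mu + m^2\, \partial_\mu s .
\end{aligned}
```

The field strength is unchanged, whereas a mass term for $A_\mu$ picks up a contribution proportional to $m^2 \partial_\mu s$, which for a general non-constant s(x) vanishes only when m = 0.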

Appearances can therefore be deceptive. The spin 1 vector fields might 
look the same, but the gauge invariance of the gauge field associates it with 
an anti-symmetric second-rank tensor. The anti-symmetric property of the 
photon tensor endows it with a property called transversality, which means 
that the physical excitations of the field $E_i$, $B_i$ are transverse to the direction
of propagation (i.e. to the direction of its momentum or wavenumber) $k^i$. This
is not the case for the Proca field. It has components of its field in the direction 
of motion, i.e. longitudinal components. The extra s = 0 mode in the helicity 
values for the Proca field corresponds to a longitudinal mode. 

For a massless field travelling in the $x^3$ direction, $k_\mu = (k, 0, 0, k)$.
Transversality means that

\[ k^{i} F_{i\mu} = \partial^{i} F_{i\mu} = 0, \tag{9.141} \]

which is guaranteed by Maxwell's equations away from sources. In gauge form,

\[ k^{i} A_{i} = 0, \tag{9.142} \]

which can always be secured by a gauge transformation. For the massive vector
field, the lack of gauge invariance means that this condition cannot be secured.


9.4.5 Other spacetime dimensions 


In a different number of spacetime dimensions n + 1, the whole of the above 
(3 + 1) dimensional procedure for finding the irreducible representations must 
be repeated, and the spin labels must be re-evaluated in the framework of a new 
set of representations for the Lorentz group. This will not be pursued here. 


9.4.6 Factorization of proper Lorentz transformations 


From the discussion of the Lie algebra above, one sees that an arbitrary element
of the proper or restricted Lorentz group can be expressed as a product of a
rotation and a boost. This only applies to the restricted transformations, and
is only one possible way of parametrizing such a transformation. The result
follows from the fact that a general boost may be written as

\[ L(B) = \begin{pmatrix} \gamma & -\gamma\beta^{i} \\ -\gamma\beta^{i} & \delta_{ij} + \dfrac{\gamma - 1}{\beta^{2}}\,\beta^{i}\beta^{j} \end{pmatrix} \tag{9.143} \]

and a rotation may be written

\[ L(R) = \begin{pmatrix} 1 & 0 \\ 0 & R_{ij} \end{pmatrix}. \tag{9.144} \]

The result can be shown starting from a general Lorentz transformation as in
eqn. (9.98). Suppose we operate on this group element with an inverse boost (a
boost with $\beta^{i} \to -\beta^{i}$):

\[ L^{-1}(B)\,L = \begin{pmatrix} \gamma & \gamma\beta^{i} \\ \gamma\beta^{i} & \delta_{ij} + \dfrac{\gamma - 1}{\beta^{2}}\,\beta^{i}\beta^{j} \end{pmatrix} \begin{pmatrix} L^{0}{}_{0} & L^{0}{}_{j} \\ L^{i}{}_{0} & L^{i}{}_{j} \end{pmatrix}, \tag{9.145} \]

where we define the velocity to be

\[ \beta^{i} = -\left(\frac{L^{i}{}_{0}}{L^{0}{}_{0}}\right). \tag{9.146} \]

This makes

\[ \gamma = L^{0}{}_{0}, \tag{9.147} \]

and it then follows from eqns. (9.89) that this product has the form

\[ L^{-1}(B)\,L = \begin{pmatrix} 1 & 0 \\ 0 & M_{ij} \end{pmatrix} = L(R). \tag{9.148} \]

This result is clearly a pure rotation, meaning that we can rearrange the formula
to express the original arbitrary proper Lorentz transformation as a product of a
boost and a rotation,

\[ L = L(B)\,L(R). \tag{9.149} \]
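
The factorization can be tried out numerically. The sketch below composes a boost and a rotation, then recovers the two factors from the product alone. It assumes the standard boost matrix with off-diagonal entries -γβ^i, reads the velocity off the first column as β^i = -L^i_0 / L^0_0, and uses an arbitrary velocity and rotation angle; the helper names are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def boost(beta):
    """General boost for a velocity 3-vector beta (units of c), non-zero speed assumed."""
    b2 = sum(x * x for x in beta)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    m = [[0.0] * 4 for _ in range(4)]
    m[0][0] = gamma
    for i in range(3):
        m[0][i + 1] = m[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            m[i + 1][j + 1] = (1.0 if i == j else 0.0) + (gamma - 1.0) * beta[i] * beta[j] / b2
    return m

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1.0]]

# A "general" restricted transformation, built from known factors for the test.
L = matmul(boost([0.3, -0.2, 0.5]), rotation_z(0.7))

# Read the velocity off the first column (sign convention is an assumption here).
beta = [-L[i][0] / L[0][0] for i in (1, 2, 3)]

# Undo the boost: what remains should be a pure rotation block.
residual = matmul(boost([-b for b in beta]), L)

# Recompose L = L(B) L(R) from the extracted pieces.
recomposed = matmul(boost(beta), residual)
```

The first row and column of `residual` reduce to (1, 0, 0, 0), and its spatial block is the rotation that was put in.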


9.4.7 The inhomogeneous Lorentz group or Poincaré group in 3 + 1 
dimensions 


If the inhomogeneous translation term, $a_\mu$, is not set to zero in eqn. (9.66), one
is led to a richer and more complex group structure [137]. This is described by
the so-called inhomogeneous Lorentz group, or Poincaré group. It is a synthesis
of the physics of translations, from earlier in this chapter, and the fixed origin
behaviour of the homogeneous Lorentz group. The most general transformation
of this group can be written

\[ x'^{\mu} = L^{\mu}{}_{\nu}\, x^{\nu} + a^{\mu}, \tag{9.150} \]

where $a^{\mu}$ is an $x^{\mu}$-independent constant translation vector. These
transformations cannot be represented by a $d_R = 4$ representation by direct matrix
multiplication, but a $d_R = 5$ representation is possible, by analogy with
eqn. (9.14), by embedding in one extra dimension:

\[ U_{3+1+1}\, x^{\mu} = \begin{pmatrix} L^{\mu}{}_{\nu} & a^{\mu} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x^{\nu} \\ 1 \end{pmatrix} = \begin{pmatrix} L^{\mu}{}_{\nu}\, x^{\nu} + a^{\mu} \\ 1 \end{pmatrix}. \tag{9.151} \]
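
A minimal sketch of this embedding follows. It builds the 5 x 5 matrix from a Lorentz matrix and a translation, applies it to a point, and composes two such transformations by plain matrix multiplication. The particular rotation and shift are arbitrary illustrative values, and the function names are not from the text.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def embed(L, a):
    """5 x 5 matrix with L in the upper-left block and a in the last column."""
    rows = [list(L[i]) + [a[i]] for i in range(4)]
    rows.append([0.0, 0.0, 0.0, 0.0, 1.0])
    return rows

def apply(U, x):
    """Act on the extended column (x^0, ..., x^3, 1) and drop the trailing 1."""
    column = list(x) + [1.0]
    return [sum(U[i][k] * column[k] for k in range(5)) for i in range(4)]

# A 90 degree rotation about the x^3 axis (a simple element of the Lorentz group).
ROT_Z_90 = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, -1.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
SHIFT = [1.0, 2.0, 3.0, 4.0]

U = embed(ROT_Z_90, SHIFT)
moved = apply(U, [0.0, 1.0, 0.0, 0.0])   # rotate, then translate
twice = matmul(U, U)                      # composition of two transformations
composite_shift = [twice[i][4] for i in range(4)]   # equals L a + a
```

The last column of the product shows the characteristic semi-direct structure: the second transformation rotates the first translation before adding its own.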


The generic infinitesimal Poincaré transformation may be written

\[ U = 1 + \frac{i}{2}\,\omega_{\mu\nu} T^{\mu\nu} + i\,\epsilon^{\rho} k_{\rho}, \tag{9.152} \]

for some scale $\chi_h$ with dimensions of action. Inspired by the differential
representation for the translation group, we find a differential form for the
homogeneous Lorentz group, which might be combined with the translation
group in a straightforward way. These are:

\[
\begin{aligned}
T^{\mu\nu}_{\rm diff} &= -i\,(x^{\mu}\partial^{\nu} - x^{\nu}\partial^{\mu}) \\
J_i &= \frac{1}{2}\,\epsilon_{ijk} T^{jk} = -\frac{i}{2}\,\chi_h\,\epsilon_{ijk}\,(x_j\partial_k - x_k\partial_j) \\
K_i &= T_{0i} \\
p_{\mu} &= \chi_h\, k_{\mu} = -i\,\chi_h\,\partial_{\mu}.
\end{aligned}
\tag{9.153}
\]

An important difference between the inhomogeneous Lorentz group and the
homogeneous Lorentz group is that the total angular momentum generator, $J_i$,
is no longer just the intrinsic angular momentum of a field, but it can include
orbital angular momentum about a point displaced from the origin. This means
that we have to be more careful than before in distinguishing spin s from
j = e + f by defining it only in an inertial rest frame with zero momentum.
It is easily verified that these representations satisfy the algebra relations. Using
these forms, it is a matter of simple algebra to evaluate the full algebraic content
of the Poincaré group:

\[ [k_{\mu}, T_{\rho\sigma}] = -i\,(\eta_{\mu\rho} k_{\sigma} - \eta_{\mu\sigma} k_{\rho}), \tag{9.154} \]

or equivalently

\[
\begin{aligned}
[k_0, J_i] &= 0 \\
[k_i, J_j] &= -i\,\chi_h\,\epsilon_{ijm} k_m.
\end{aligned}
\tag{9.155}
\]

These relations are trivial statements about the transformation properties of $k_0$
(scalar) and $k_i$ (vector) under rotations. Using the definitions above, we also find
that

\[
\begin{aligned}
[k_0, K_i] &= i\,k_i \\
[k_i, K_j] &= -i\,\chi_h\,\eta_{ij} k_0.
\end{aligned}
\tag{9.156}
\]

These relations show that a boost $K_i$ affects $k_0$, $k_i$, but not $k_j$ for $j \neq i$.

Massive fields It is a curious feature of the Poincaré group, which comes about
because it arises in connection with the finite speed of light, that the mass of
fields plays a role in their symmetry properties. Physically, massless fields are
bound to move at the speed of light so they have no rest frame about which to
define intrinsic properties, like spin, which depend on coordinate concepts. It
is therefore necessary to find another way to characterize intrinsic rotation. We
can expect mass to play a role since it is linked to the momentum, which is the
generator of translations.
The Poincaré group leaves invariant the relation

\[ p^{2}c^{2} + m^{2}c^{4} = \text{const}, \tag{9.157} \]

where $p_\mu = (mc, p_i)$. This is, in fact, a Casimir invariant, $p^{\mu}p_{\mu}$, up
to dimensional factors. Recall from the discussion of translations that the
momentum may be written

\[ p_{\mu} = \chi_h\, k_{\mu}, \tag{9.158} \]

where $k_\mu$ is the wavenumber or reciprocal lattice vector. As in the case of the
other groups, we can label the field by invariant quantities. Here we have the
quadratic Casimir invariants

\[
\begin{aligned}
J^{2} &= j(j + 1)\,\chi_h^{2} \\
p^{2} &= p^{2}c^{2} + m^{2}c^{4},
\end{aligned}
\tag{9.159}
\]

which commute with the group generators and are thus independent of
symmetry basis:

\[
\begin{aligned}
[p^{2}, p_{\mu}] &= 0 \\
[p^{2}, J_i] &= 0 \\
[p^{2}, K_i] &= 0.
\end{aligned}
\tag{9.160}
\]

A covariant rotation operator can be identified which will be useful for dis- 
cussing intrinsic in chapter 11. It is called the Pauli-Lubanski vector, and it is 
defined by 


1 vA 
W, = 7 Xh Euvap T p’. (9.161) 
The quadratic form, W?, is Lorentz- and translation-invariant: 
[W?, Pul =0 
[W?, Tav] = 0. (9.162) 


W satisfies 


W“p, =0 (9.163) 
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and 
[W,, Wi) = i€ uvpo W?’ p° (9.164) 
1 
W? = 5 Xn? T! Tyu p° + Xn? TET, Popas (9.165) 


If we consider W, in a rest frame where p; = 0, we have 


1 
Wrest = —mc(0, Ji, J2, J3)rest = — sme, Si, S2, $3), (9.166) 


where S; may be thought of as the intrinsic (non-orbital) rotation of the field 
(called spin of the representation), which is defined by 


SS (9.167) 
rest 


Thus, W* is clearly a 4-vector with the properties of intrinsic rotations in a rest 
frame. Indeed, evaluating eqn. (9.164) in a rest frame, we find that 


[W;, Wj] = —imc eij W4. (9.168) 
Or setting W; = —mc J;, we recover the rotational algebra 
[Ji, Jj] =i xn Eijk J". (9.169) 


Thus the Poincaré-invariant quadratic form is 


2 
Wrest 


=m? J? =m e j(j +1) xn? Ir. (9.170) 


For classifying fields, we are interested in knowing which of the properties of 
the field can be determined independently (or which simultaneous eigenvalues 
of the symmetry operators exist). Since the rest mass m is fixed by observation, 
we need only specify the 3-momentum, $p_i$, to characterize linear motion. From 
eqns. (9.155), we find that $J_i$ and $p_i$ do not commute so they are not 
(non-linearly) independent, but there is a rotation (or angular momentum) which does 
commute with $p_i$. It is called the helicity and is defined by 


$$\lambda = J_i\,\hat p^i, \tag{9.171}$$


where $\hat p^i$ is a unit vector in the direction of the spatial 3-momentum. The 
commutator then becomes 


$$[p_i, J_j]\,\hat p^j = i\,\chi_h\,\epsilon_{ijk}\,p^k\hat p^j = 0. \tag{9.172}$$


Thus, $\lambda$ can be used to label the state of a field. A state vector is therefore 
characterized by the labels (‘quantum numbers’ in quantum mechanics) 


$$|\Phi\rangle \equiv |m, j, p_i, \lambda\rangle, \tag{9.173}$$

i.e. the mass, the highest weight of the rotational symmetry, the linear 
momentum and the helicity. In a rest frame, the helicity becomes ill defined, so 
one must choose an arbitrary component of the spin, usually $m_j$, as the limiting 
value. 

We would like to know how these states transform under a given Poincaré 
transformation. Since the states, as constructed, are manifestly eigenstates of 
the momentum, a translation simply incurs a phase 


$$|\Phi\rangle \to \exp\left(i\,p^\mu a_\mu\right)|\Phi\rangle. \tag{9.174}$$


Homogeneous Lorentz transformations can be used to halt a moving state. 
The state $|m, j, p_i, \lambda\rangle$ can be obtained from $|m, j, 0, s_i\rangle$ by a rotation into the 
direction of $p_i$ followed by a boost $\exp(i\theta^i K_i)$ to set the frame in motion. Thus 


$$|m, j, p_i, \lambda\rangle = L\,|m, j, 0, s_i\rangle. \tag{9.175}$$


The sub-group which leaves the momentum $p_\mu$ invariant is called the little group 
and can be used to classify the intrinsic rotational properties of a field. For 
massive fields in 3 + 1 dimensions, the little group is covered by SU(2), but this 
is not the case for massless fields. 


Massless fields For massless fields, something special happens as a result of 
motion at the speed of light in a special direction. It is as though a field is 
squashed into a plane, and the rotational behaviour becomes two-dimensional 
and Abelian. The direction of motion decouples from the two orthogonal 
directions. Consider a state of the field 


$$|\Phi_\pi\rangle = |m, s, \pi, \lambda\rangle, \tag{9.176}$$


where the momentum $\pi_\mu = \pi\,(1, 0, 0, 1)$ is in the $x^3$ direction, and the Lorentz 
energy condition becomes $p^2c^2 = 0$ or $p_0 = |p_i|$. This represents a ‘particle’ 
travelling in the $x^3$ direction at the speed of light. The little group, which leaves 
$p_\mu$ invariant, may be found and is generated by 


$$\begin{aligned}
\Lambda_1 &= J_1 + K_2 \\
\Lambda_2 &= J_2 - K_1 \\
\Lambda_3 &= J_3.
\end{aligned} \tag{9.177}$$


Clearly, the $x^3$ direction is privileged. These are the generators of the 
two-dimensional Euclidean group of translations and rotations called ISO(2) or 
$E_2$. It is easily verified from the Poincaré group generators that the little group 
generators commute with the momentum operator 


$$[\Lambda_i, p_\mu]\,|\Phi_\pi\rangle = 0. \tag{9.178}$$

The commutation relations for $\Lambda_i$ are 


$$\begin{aligned}
[\Lambda_3, \Lambda_1] &= i\Lambda_2 \\
[\Lambda_3, \Lambda_2] &= -i\Lambda_1 \\
[\Lambda_1, \Lambda_2] &= 0.
\end{aligned} \tag{9.179}$$


The last line signals the existence of an invariant sub-group. Indeed, one can 
define a Cartan-Weyl form and identify an invariant sub-algebra $H$, 


$$\begin{aligned}
E_\pm &= \Lambda_1 \pm i\Lambda_2 \\
H &= \Lambda_3,
\end{aligned} \tag{9.180}$$
with Casimir invariant 
$$\begin{aligned}
C_2 &= \Lambda_1^2 + \Lambda_2^2 \\
0 &= [C_2, \Lambda_i].
\end{aligned} \tag{9.181}$$
The stepping operators satisfy 
$$[H, E_\pm] = \pm E_\pm, \tag{9.182}$$


i.e. $\Lambda_c = \pm 1$. This looks almost like the algebra for su(2), but there is 
an important difference, namely the Casimir invariant. $\Lambda_3$ does not occur in 
the Casimir invariant since it would spoil its commutation properties (it has 
decoupled). This means that the value of $\Lambda_c = m_j$ is not restricted by the 
Schwarz inequality, as in section 8.5.10, to less than $\pm\Lambda_{\rm max} = \pm j$. The stepping 
operators still require the solutions for $\Lambda_c = m_j$ to be spaced by integers, but 
there is no upper or lower limit on the allowed values of the spin eigenvalues. 
In order to make this agree, at least in notation, with the massive case, we label 
physical states by $\Lambda_3$ only, taking 


$$\Lambda_1|\Phi_\pi\rangle = \Lambda_2|\Phi_\pi\rangle = 0. \tag{9.183}$$


Thus, we may take the single value $H = \Lambda_3 = \Lambda_c = m_j = \lambda$ to be the angular 
momentum in the direction $x^3$, which is the helicity, since we have taken the 
momentum to point in this direction. See section 11.7.5 for further discussion 
on this point. 
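
The little-group relations for massless fields can be checked numerically. What follows is only a sketch under stated assumptions: it uses the 4 x 4 vector representation with $(J_i)_{jk} = -i\epsilon_{ijk}$ and $K_i = i(E_{0i} + E_{i0})$ acting on column vectors $(x^0, x^1, x^2, x^3)$, and it reads the generators as $\Lambda_1 = J_1 + K_2$, $\Lambda_2 = J_2 - K_1$, $\Lambda_3 = J_3$. These conventions and all function names are illustrative choices, not taken from the text.

```python
# Sketch: check the E(2) little-group algebra with explicit 4x4 matrices.
# Assumed conventions: (J_i)_{jk} = -i eps_{ijk}, K_i = i (E_{0i} + E_{i0}).

def zeros():
    return [[0j] * 4 for _ in range(4)]

def unit(row, col, value):
    m = zeros()
    m[row][col] = value
    return m

def add(a, b, sign=1):
    return [[a[r][c] + sign * b[r][c] for c in range(4)] for r in range(4)]

def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def scale(s, a):
    return [[s * a[r][c] for c in range(4)] for r in range(4)]

def comm(a, b):
    return add(mul(a, b), mul(b, a), sign=-1)

def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(4) for c in range(4))

def rotation(i, j):
    # generator of rotations in the (i, j) plane: -i (E_ij - E_ji)
    return add(unit(i, j, -1j), unit(j, i, 1j))

def boost(i):
    # generator of boosts along axis i: i (E_0i + E_i0)
    return add(unit(0, i, 1j), unit(i, 0, 1j))

def act(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

J1, J2, J3 = rotation(2, 3), rotation(3, 1), rotation(1, 2)
K1, K2, K3 = boost(1), boost(2), boost(3)

L1 = add(J1, K2)           # Lambda_1 = J_1 + K_2
L2 = add(J2, K1, sign=-1)  # Lambda_2 = J_2 - K_1
L3 = J3                    # Lambda_3 = J_3

light_like = [1, 0, 0, 1]  # momentum along x^3 at the speed of light

print(close(comm(L3, L1), scale(1j, L2)))
print(close(comm(L3, L2), scale(-1j, L1)))
print(close(comm(L1, L2), zeros()))
```

In this representation each $\Lambda_i$ also annihilates the light-like vector `light_like`, which is the sense in which the little group leaves the momentum unchanged.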


9.4.8 Curved spacetime: Killing’s equation 


In a curved spacetime, the result of an infinitesimal translation from a point can 
depend on the local curvature there, i.e. the translation is position-dependent. 
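Consider an infinitesimal inhomogeneous translation $\epsilon^\mu(x)$, such that 


$$x^\mu \to L^\mu{}_\nu\,x^\nu + \epsilon^\mu(x). \tag{9.184}$$

Then we have 


$$\frac{\partial x'^\mu}{\partial x^\nu} = L^\mu{}_\nu + (\partial_\nu\epsilon^\mu), \tag{9.185}$$

and 

$$\begin{aligned}
ds'^2 &= g_{\mu\nu}\left(L^\mu{}_\rho + (\partial_\rho\epsilon^\mu)\right)\left(L^\nu{}_\sigma + (\partial_\sigma\epsilon^\nu)\right)dx^\rho dx^\sigma \\
&= g_{\mu\nu}\left[L^\mu{}_\rho L^\nu{}_\sigma + L^\mu{}_\rho(\partial_\sigma\epsilon^\nu) + (\partial_\rho\epsilon^\mu)L^\nu{}_\sigma + \cdots + O(\epsilon^2)\right]dx^\rho dx^\sigma.
\end{aligned} \tag{9.186}$$


The first term here vanishes, as above, owing to the anti-symmetry of $\omega_{\mu\nu}$. 
Expanding the second term using eqn. (9.95), and remembering that both 
$\omega_{\mu\nu}$ and $\epsilon_\mu(x)$ are infinitesimal so that $\epsilon\,\omega_{\mu\nu}$ is second-order and therefore 
negligible, we have an additional term, which must vanish if we are to have 
invariance of the line element: 


$$\partial_\mu\epsilon_\nu + \partial_\nu\epsilon_\mu = 0. \tag{9.187}$$
The covariant generalization of this is clearly 
$$\nabla_\mu\epsilon_\nu + \nabla_\nu\epsilon_\mu = 0. \tag{9.188}$$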


This equation is known as Killing’s equation, and it is a constraint on the 
allowed transformations, $\epsilon^\mu(x)$, which preserve the line element, in a spacetime 
which is curved. A vector, $\epsilon^\mu(x)$, which satisfies Killing’s equation is called 
a Killing vector of the metric $g_{\mu\nu}$. Since this equation is symmetrical, it has 
$\frac{1}{2}\left[(n+1)^2 + (n+1)\right]$ independent components. Since $\epsilon^\mu$ has only $n+1$ components, 
the solution is over-determined. However, there are $\frac{1}{2}\left[(n+1)^2 - (n+1)\right]$ 
anti-symmetric components in Killing’s equation which are unaffected; thus 
there must be 


$$m = (n+1) + \frac{1}{2}\left[(n+1)^2 - (n+1)\right] \tag{9.189}$$
free parameters in the Killing vector, in the form: 
$$\begin{aligned}
\nabla_\mu\epsilon_\nu + \nabla_\nu\epsilon_\mu &= 0 \\
\epsilon_\mu(x) &= a_\mu + \omega_{\mu\nu}\,x^\nu,
\end{aligned} \tag{9.190}$$


where $\omega_{\mu\nu} = -\omega_{\nu\mu}$. A manifold is said to be ‘maximally symmetric’ if it has the 
maximum number of Killing vectors, i.e. if the line element is invariant under 
the maximal number of transformations. 
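
As a concrete check of the counting, the sketch below evaluates the number of free parameters, $(n+1) + \frac{1}{2}[(n+1)^2 - (n+1)]$, and verifies by finite differences that $\epsilon_\mu = a_\mu + \omega_{\mu\nu}x^\nu$ with anti-symmetric $\omega_{\mu\nu}$ satisfies the flat-space Killing equation. It assumes Cartesian coordinates in flat spacetime, so that covariant derivatives reduce to partial derivatives; the function names and sample numbers are illustrative only.

```python
def killing_parameter_count(n):
    """Translations (n + 1) plus anti-symmetric omega components, in n + 1 dimensions."""
    d = n + 1
    return d + (d * d - d) // 2


def killing_vector(a, omega, x):
    """eps_mu(x) = a_mu + omega_{mu nu} x^nu (flat space, Cartesian coordinates)."""
    d = len(a)
    return [a[m] + sum(omega[m][k] * x[k] for k in range(d)) for m in range(d)]


def killing_residual(a, omega, x, h=1e-3):
    """Largest |d_mu eps_nu + d_nu eps_mu| at x, using central differences."""
    d = len(a)

    def deriv(mu, nu):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        return (killing_vector(a, omega, xp)[nu]
                - killing_vector(a, omega, xm)[nu]) / (2 * h)

    return max(abs(deriv(m, k) + deriv(k, m)) for m in range(d) for k in range(d))


# hypothetical sample data: a translation and an anti-symmetric omega
a = [0.3, -1.0, 0.5, 2.0]
omega = [[0.0, 0.1, -0.2, 0.3],
         [-0.1, 0.0, 0.4, -0.5],
         [0.2, -0.4, 0.0, 0.6],
         [-0.3, 0.5, -0.6, 0.0]]
x = [0.7, -0.2, 1.1, 0.4]

print(killing_parameter_count(3))            # ten parameters in 3 + 1 dimensions
print(killing_residual(a, omega, x) < 1e-9)  # Killing's equation holds
```

For $n = 3$ the count is ten, matching the number of parameters of the Poincaré group.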


9.5 Galilean invariance 


The relativity group which describes non-Einsteinian physics is the Galilean 
group. Like the Poincaré group, it contains translations, rotations and boosts. 
As a group, it is no smaller, and certainly no less complicated, than the Lorentz 
group. In fact, it may be derived as the $c \to \infty$ limit of the Poincaré group. But 
there is one conceptual simplification which makes Galilean transformations 
closer to our everyday experience: the absence of a cosmic speed limit means 
that arbitrary boosts of the Galilean transformations commute with one another. 
This alters the algebra of the generators. 
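
The commuting of Galilean boosts can be seen directly with small matrices. The following sketch acts on columns $(t, x, y)$ and compares two perpendicular Galilean boosts with two perpendicular Lorentz boosts; the matrix forms, sign conventions and function names are assumptions made for illustration.

```python
import math


def matmul(a, b):
    size = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def max_difference(a, b):
    size = len(a)
    return max(abs(a[r][c] - b[r][c]) for r in range(size) for c in range(size))


def galilean_boost(vx, vy):
    # t' = t, x' = x - vx t, y' = y - vy t
    return [[1.0, 0.0, 0.0], [-vx, 1.0, 0.0], [-vy, 0.0, 1.0]]


def lorentz_boost_x(v, c):
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    # t' = g (t - v x / c^2), x' = g (x - v t)
    return [[g, -g * v / c ** 2, 0.0], [-g * v, g, 0.0], [0.0, 0.0, 1.0]]


def lorentz_boost_y(v, c):
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [[g, 0.0, -g * v / c ** 2], [0.0, 1.0, 0.0], [-g * v, 0.0, g]]


def boost_commutator_size(boost_a, boost_b):
    """Largest matrix element of (A B - B A)."""
    return max_difference(matmul(boost_a, boost_b), matmul(boost_b, boost_a))


galilean = boost_commutator_size(galilean_boost(0.5, 0.0), galilean_boost(0.0, 0.5))
lorentz = boost_commutator_size(lorentz_boost_x(0.5, 1.0), lorentz_boost_y(0.5, 1.0))
large_c = boost_commutator_size(lorentz_boost_x(0.5, 1.0e4), lorentz_boost_y(0.5, 1.0e4))

print(galilean)   # Galilean boosts commute
print(lorentz)    # Lorentz boosts in different directions do not
print(large_c)    # the mismatch fades as c grows
```

The Lorentz mismatch is of order $v^2/c^2$, so it disappears in the $c \to \infty$ limit, which is the sense in which the Galilean algebra is a limit of the Poincaré algebra.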


9.5.1 Physical basis 


The Galilean group applies physically to objects moving at speeds much less 
than the speed of light. For this reason, it cannot describe massless fields at 
all. The care required in distinguishing massless from massive concepts in the 
Poincaré algebra does not arise here for that simple reason. An infinitesimal 
Galilean transformation involves spatial and temporal translations, now written 
separately as 


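$$\begin{aligned}
x'^i &= x^i + \delta x^i \\
t' &= t + \delta t,
\end{aligned} \tag{9.191}$$


rotations by $\theta^i = \frac{1}{2}\epsilon^{ijk}\omega_{jk}$, and boosts by incremental velocity $\delta v^i$ 


$$x'^i = x^i - \delta v^i\,t. \tag{9.192}$$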


This may be summarized by the standard infinitesimal transformation form 


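$$\begin{aligned}
x'^i &= \left(1 + \frac{i}{2}\,\omega_{lm}T^{lm}\right)^i{}_j\,x^j \\
x'^i &= \left(1 + i\Theta\right)^i{}_j\,x^j,
\end{aligned} \tag{9.193}$$
where the matrix 
$$\Theta = k_i\,\delta x^i - \tilde\omega\,\delta t + \theta_i\,T^i_B + \delta v_i\,T^i_E. \tag{9.194}$$
The exponentiated translational part of this is clearly a plane wave: 
$$U \sim \exp\, i\left(\mathbf{k}\cdot\delta\mathbf{x} - \tilde\omega\,\delta t\right). \tag{9.195}$$
Galilean transformations preserve the Euclidean scalar product 


$$\mathbf{x}\cdot\mathbf{y} = x^i y_i. \tag{9.196}$$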


9.5.2 Retardation and boosts 


Retardation is the name given to the delay experienced in observing the effect of 
a phenomenon which happened at a finite distance from the source. The delay is 
caused by the finite speed of the disturbance. For example, the radiation at great 
distances from an antenna is retarded by the finite speed of light. A disturbance 
in a fluid caused at a distant point is only felt later because the disturbance 
travels at the finite speed of sound in the fluid. The change in momentum felt by 
a ballistic impulse in a solid or fluid travels at the speed of transport, i.e. the rate 
of flow of the fluid or the speed of projectiles emanating from the source. 

Retardation expresses causality, and it is important in many physical prob- 
lems. In Galilean physics, it is less important than in Einsteinian physics 
because cause and effect in a Galilean world (where $v \ll c$) are often assumed 
to be linked instantaneously. This is the Galilean approximation, which treats 
the speed of light as effectively infinite. However, retardation transformations 
become a useful tool in systems where the action is not invariant under boosts. 
This is because they allow us to derive a covariant form by transforming a 
non-covariant action. For example, the action for the Navier-Stokes equation 
can be viewed as a retarded snapshot of a particle field in motion. It is a snapshot 
because the action is not covariant with respect to boosts. We also derived a 
retarded view of the electromagnetic field arising from a particle in motion in 
section 7.3.4. 

Retardation can be thought of as the opposite of a boost transformation. A 
boost transformation is characterized by a change in position due to a finite 
speed difference between two frames. In a frame x’ moving with respect to a 
frame x we have 


$$x'^i(t) = x^i(t) + v^i\,t. \tag{9.197}$$


Rather than changing the position variable, we can change the way we choose to 
measure time taken for the moving frame to run into an event which happened 
some distance from it: 


$$t_{\rm ret} = t - \frac{(x' - x)^i}{v^i}. \tag{9.198}$$
Whereas the idea of simultaneity makes this idea more complicated in the 
Einsteinian theory, here the retarded time is quite straightforward for constant 
velocity, $v^i$, between the frames. If we transform a system into a new frame, 
it is sometimes convenient to parametrize it in terms of a retarded time. To do 
this, we need to express both coordinates and derivatives in terms of the new 
quantity. Considering an infinitesimal retardation 
$$t_{\rm ret} = t - \frac{dx^i}{v^i}, \tag{9.199}$$


it is possible to find the transformation rule for the time derivative, using the 
requirement that 

$$\frac{dt_{\rm ret}}{dt_{\rm ret}} = 1. \tag{9.200}$$
It may be verified that 


$$\left[\partial_t + v^i\partial_i\right]\left[t - \frac{dx^j}{v^j}\right] = 1. \tag{9.201}$$
Thus, one identifies 
$$\frac{d}{dt_{\rm ret}} = \partial_t + v^j\partial_j. \tag{9.202}$$


This retarded time derivative is sometimes called the substantive derivative. In 
fluid dynamics books it is written 
$$\frac{D}{Dt} \equiv \frac{d}{dt_{\rm ret}}. \tag{9.203}$$


It is simply the retarded-time total derivative. Compare this procedure with the 
form of the Navier-Stokes equation in section 7.5.1 and the field of a moving 
charge in section 7.3.4. 
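
A quick numerical illustration of the substantive derivative $\partial_t + v^j\partial_j$: a profile carried along rigidly at velocity $v$, $f(t, x) = F(x - vt)$, has a non-zero partial time derivative at a fixed point but a vanishing substantive derivative. The sketch below works in one spatial dimension with central differences; the Gaussian profile and all names are assumptions chosen for illustration.

```python
import math


def profile(s):
    # hypothetical disturbance shape
    return math.exp(-s * s)


def advected(t, x, v):
    """A disturbance carried rigidly at velocity v: f(t, x) = F(x - v t)."""
    return profile(x - v * t)


def partial_t(f, t, x, h=1e-4):
    return (f(t + h, x) - f(t - h, x)) / (2 * h)


def partial_x(f, t, x, h=1e-4):
    return (f(t, x + h) - f(t, x - h)) / (2 * h)


def substantive_derivative(f, t, x, v):
    """D/Dt = d/dt + v d/dx, the derivative along the flow."""
    return partial_t(f, t, x) + v * partial_x(f, t, x)


v = 2.0
field = lambda t, x: advected(t, x, v)

print(partial_t(field, 0.3, 0.5))                  # changes at a fixed point
print(substantive_derivative(field, 0.3, 0.5, v))  # approximately zero along the flow
```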


9.5.3 Generator algebra 


The generators $T_B$ and $T_E$ are essentially the same generators as those which 
arise in the context of the Lorentz group in eqn. (9.116). The simplest way 
to derive the Galilean group algebra at this stage is to consider the $c \to \infty$ 
properties of the Poincaré group. The symbols $T_B$ and $T_E$ help to identify 
the origins and the role of the generators within the framework of Lorentzian 
symmetry, but they are cumbersome for more pedestrian work. Symbols for the 
generators which are in common usage are 


$$\begin{aligned}
J^i &= T^i_B \\
N^i &= T^i_E.
\end{aligned} \tag{9.204}$$


These are subtly different from, but clearly related to, the symbols used for 
rotations and boosts in the Poincaré algebra. The infinitesimal parameters, $\theta^a$, 
of the group are 


$$\theta^a = \left\{\delta t, \delta x^i, \theta^i, \delta v^i\right\}. \tag{9.205}$$


In 3 + 1 dimensions, there are ten such parameters, as there are in the Poincaré 
group. These are related to the symbols of the Lorentz group by 


$$\delta t = \epsilon^0/c, \tag{9.206}$$
and 


$$\begin{aligned}
H + mc^2 &= c\,p_0 = \chi_h\,c\,k_0 \\
H &= \chi_h\,\tilde\omega \;(= \hbar\tilde\omega).
\end{aligned} \tag{9.207}$$


Note that the zero point is shifted so that the energy H does not include the rest 
energy mc? of the field in the Galilean theory. This is a definition which only 
changes group elements by a phase and the action by an irrelevant constant. 
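The algebraic properties of the generators are the $c \to \infty$ limit of the Poincaré 
algebra. They are summarized by the following commutators: 


$$\begin{aligned}
[k_i, k_j] &= 0 \\
[N_i, N_j] &= 0 \\
[H, k_i] &= 0 \\
[H, J_i] &= 0 \\
[H, N_i] &= i\,\chi_h\,k_i \\
[k_i, J_l] &= -i\,\chi_h\,\epsilon_{ilm}\,k_m \\
[k_i, N_j] &= i\,m\,\chi_h\,\delta_{ij} \\
[J_i, N_l] &= i\,\epsilon_{ilm}\,N_m \\
[J_i, J_j] &= i\,\epsilon_{ijk}\,J_k,
\end{aligned} \tag{9.208}$$


where $p_0/c \to m$ is the mass, having neglected $H/c = \chi_h\tilde\omega/c$. The Casimir 
invariants of the Galilean group are 


$$J^i J_i, \qquad k^i k_i, \qquad N^i N_i. \tag{9.209}$$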


The energy condition is now the limit of the Poincaré Casimir invariant, which 
is singular and asymmetrical: 


$$\frac{p^i p_i}{2m} = E \tag{9.210}$$


(see section 13.5). 


9.6 Conformal invariance 


If we relax the condition that the line element $ds^2$ must be preserved, and require 
it only to transform isotropically (which preserves $ds^2 = 0$), then we can allow 
transformations of the form 


$$\begin{aligned}
ds^2 &= -dt^2 + dx^2 + dy^2 + dz^2 \\
&\to \Omega^2(x)\left(-dt^2 + dx^2 + dy^2 + dz^2\right),
\end{aligned} \tag{9.211}$$

where $\Omega(x)$ is a non-singular, non-vanishing function of $x^\mu$. In the action, we 
combine this with a similar transformation of the fields, e.g. in $n + 1$ dimensions, 


$$\phi(x) \to \Omega^{(1-n)/2}\,\phi(x). \tag{9.212}$$


This transformation stretches spacetime into a new shape by deforming it with 
the function $\Omega(x)$ equally in all directions. For this reason, the conformal 
transformation preserves the angle between any two lines which meet at a vertex, 
even though it might bend straight lines into curves or vice versa. 

Conformal transformations are important in physics for several reasons. They 
represent a deviation from systems of purely Markov processes. If a translation 
in spacetime is accompanied by a change in the environment, then the state 
of the system must depend on the history of changes which occurred in the 
environment. This occurs, for instance, in the curvature of spacetime, where 
parallel transport is sensitive to local curvature; it also occurs in gauge theories, 
where a change in a field’s internal variables (gauge transformation) accompa- 
nies translations in spacetime, and in non-equilibrium statistical physics where 
the environment changes alongside dynamical processes, leading to conformal 
distortions of the phase space. Conformal symmetry has many applications. 

Because the conformal transformation is a scaling of the metric tensor, its 
effect is different for different kinds of fields and their interactions. The number 
of powers of the metric which occurs in the action (or, loosely speaking, the 
number of spacetime indices on the fields) makes the invariance properties of the 
action and the field equations quite different. Amongst all the fields, Maxwell’s 
free equations (for a massless vector field) in 3 + 1 dimensions stand out for 
their general conformal invariance. This leads to several useful properties of 
Maxwell’s equations, which many authors unknowingly take for granted. Scalar 
fields are somewhat different, and are conformally invariant in 1+ 1 dimensions, 
in the massless case, in the absence of self-interactions. We shall consider these 
two cases below. 

Consider now an infinitesimal change of coordinates, as we did in the case of 
Lorentz transformations: 


$$x^\mu \to \Lambda^\mu{}_\nu\,x^\nu + \epsilon^\mu(x). \tag{9.213}$$
The line element need not be invariant any longer; it may change by 
$$ds'^2 = \Omega^2(x)\,ds^2. \tag{9.214}$$


Following the same procedure as in eqn. (9.186), we obtain now a condition for 
eqn. (9.214) to be true. To first order, we have: 


$$\Omega^2(x)\,g_{\mu\nu} = g_{\mu\nu} + \partial_\mu\epsilon_\nu + \partial_\nu\epsilon_\mu. \tag{9.215}$$


Clearly, $\epsilon^\mu$ and $\Omega(x)$ must be related in order to satisfy this condition. The 
relationship is easily obtained by taking the trace of this equation, multiplying 
through by $g^{\mu\nu}$. This gives, in $n + 1$ dimensions, 
$$(\Omega^2 - 1)(n + 1) = 2\,(\partial_\lambda\epsilon^\lambda). \tag{9.216}$$


Using this to replace $\Omega(x)$ in eqn. (9.215) gives us the equation analogous to 
eqn. (9.187), but now for the full conformal symmetry: 


$$\partial_\mu\epsilon_\nu + \partial_\nu\epsilon_\mu = \frac{2}{(n+1)}\,(\partial_\lambda\epsilon^\lambda)\,g_{\mu\nu}. \tag{9.217}$$
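This is the Killing equation for the conformal symmetry. Its general solution in 
$n + 1$ dimensions, for $n > 1$, is 


$$\epsilon^\mu(x) = a^\mu + b\,x^\mu + \omega^{\mu\nu}x_\nu + 2x^\mu\,c^\nu x_\nu - c^\mu x^2, \tag{9.218}$$


where $\omega^{\mu\nu} = -\omega^{\nu\mu}$. In (1 + 1) dimensional Minkowski space, eqn. (9.217) 
reduces to two equations 

$$\begin{aligned}
\partial_0\epsilon_0 &= -\partial_1\epsilon_1 \\
\partial_0\epsilon_1 &= -\partial_1\epsilon_0.
\end{aligned} \tag{9.219}$$
In two-dimensional Euclidean space, i.e. $n = 1$, followed by a Wick rotation to 
a positive definite metric, this equation reduces simply to the Cauchy-Riemann 
relations for $\epsilon^\mu(x)$, which is solved by any analytic function in the complex 
plane. After a Wick rotation, one has 

$$\begin{aligned}
\partial_0\epsilon_0 &= \partial_1\epsilon_1 \\
\partial_0\epsilon_1 &= -\partial_1\epsilon_0.
\end{aligned} \tag{9.220}$$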


To see that this is simply the Cauchy-Riemann relations, 


$$\frac{d}{dz^*}\,f(z) = 0, \tag{9.221}$$
we make the identification 
$$\begin{aligned}
z &= x^0 + i\,x^1 \\
f(z) &= \epsilon_0 + i\,\epsilon_1
\end{aligned} \tag{9.222}$$
and note that 
$$\frac{d}{dz^*} = \frac{1}{2}\left(\partial_0 + i\,\partial_1\right). \tag{9.223}$$
This property of two-dimensional Euclidean space reflects the well known 
property of analytic functions in the complex plane, namely that they all are 
conformally invariant and solve Laplace’s equation: 


$$\nabla^2 f(x^i) = 4\,\frac{d}{dz}\frac{d}{dz^*}\,f(z) = 0. \tag{9.224}$$
It makes two-dimensional, conformal field theory very interesting. In particular 
it is important for string theory and condensed matter physics of critical 
phenomena, since the special analyticity allows one to obtain Green functions 
and conservation laws in the vicinity of so-called fixed points. 
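
The conformal Killing condition can be spot-checked numerically. The sketch below takes flat 3 + 1 dimensional spacetime with metric diag(-1, 1, 1, 1), builds $\epsilon^\mu(x) = a^\mu + b\,x^\mu + \omega^{\mu\nu}x_\nu + 2x^\mu c^\nu x_\nu - c^\mu x^2$ with anti-symmetric $\omega^{\mu\nu}$, and evaluates $\partial_\mu\epsilon_\nu + \partial_\nu\epsilon_\mu - \frac{2}{n+1}(\partial_\lambda\epsilon^\lambda)g_{\mu\nu}$ by central differences. The signature, the sample parameters and the function names are assumptions made for the illustration.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # assumed diagonal Minkowski metric, signature (-+++)


def lower(v):
    return [ETA[m] * v[m] for m in range(4)]


def dot(u, v):
    return sum(ETA[m] * u[m] * v[m] for m in range(4))


def epsilon_up(x, a, b, omega, c):
    """eps^mu = a^mu + b x^mu + omega^{mu nu} x_nu + 2 x^mu (c.x) - c^mu x^2."""
    x_low = lower(x)
    cx = dot(c, x)
    xx = dot(x, x)
    return [a[m] + b * x[m] + sum(omega[m][k] * x_low[k] for k in range(4))
            + 2.0 * x[m] * cx - c[m] * xx for m in range(4)]


def conformal_killing_residual(x, a, b, omega, c, h=1e-3):
    """Largest element of d_mu eps_nu + d_nu eps_mu - (2/4)(d.eps) eta_{mu nu}."""
    def gradient_row(mu):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        ep = lower(epsilon_up(xp, a, b, omega, c))
        em = lower(epsilon_up(xm, a, b, omega, c))
        return [(ep[nu] - em[nu]) / (2 * h) for nu in range(4)]

    grad = [gradient_row(mu) for mu in range(4)]           # grad[mu][nu] = d_mu eps_nu
    divergence = sum(ETA[m] * grad[m][m] for m in range(4))
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            metric = ETA[mu] if mu == nu else 0.0
            value = grad[mu][nu] + grad[nu][mu] - 0.5 * divergence * metric
            worst = max(worst, abs(value))
    return worst


# hypothetical sample parameters
a = [0.1, -0.2, 0.3, 0.4]
b = 0.7
c = [0.3, -0.1, 0.2, 0.05]
omega = [[0.0, 0.5, -0.3, 0.2],
         [-0.5, 0.0, 0.1, -0.4],
         [0.3, -0.1, 0.0, 0.6],
         [-0.2, 0.4, -0.6, 0.0]]
x = [0.4, 1.2, -0.7, 0.3]

print(conformal_killing_residual(x, a, b, omega, c) < 1e-8)
```

Replacing the anti-symmetric `omega` by a symmetric matrix gives a residual of order one, which illustrates that a shear is not a conformal transformation.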


9.6.1 Scalar fields in n + 1 dimensions 


We begin by writing down the action, making the appearance of the metric 
explicit: 


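$$S = \int d^{n+1}x\,\sqrt{g}\;\frac{1}{c}\left\{\frac{1}{2}(\partial_\mu\phi)\,g^{\mu\nu}(\partial_\nu\phi) + V(\phi) - J\phi\right\}. \tag{9.225}$$


Note the factor of the determinant of the metric in the volume measure: this will 
also scale in the conformal transformation. We now let 


$$\begin{aligned}
g_{\mu\nu} &\to \Omega^2\,\bar g_{\mu\nu} \\
g &\to \Omega^{2(n+1)}\,\bar g \\
\phi(x) &\to \Omega^{(1-n)/2}(x)\,\bar\phi(x) \\
J &\to \Omega^{\alpha}\,\bar J,
\end{aligned} \tag{9.226}$$


where $\alpha$ is presently unknown. It is also useful to define the ‘connection’ 
$\Gamma_\mu = \Omega^{-1}\partial_\mu\Omega$. We now examine the variation of the action under this transformation: 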


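$$\delta S = \int d^{n+1}x\,\sqrt{\bar g}\;\Omega^{n+1}\,\frac{1}{c}\left\{\partial_\mu\left(\Omega^{(1-n)/2}\delta\bar\phi\right)\bar g^{\mu\nu}\,\Omega^{-2}\,\partial_\nu\left(\Omega^{(1-n)/2}\bar\phi\right) + \delta V - \Omega^{(1-n)/2+\alpha}\,\bar J\,\delta\bar\phi\right\}. \tag{9.227}$$


Integrating by parts to separate $\delta\bar\phi$ gives 


$$\begin{aligned}
\delta S = \int d^{n+1}x\,\sqrt{\bar g}\;\Omega^{n+1}\,\frac{1}{c}\Big\{ &-(1+n-2)\,\Gamma_\mu\,\Omega^{(1-n)/2-2}\,\delta\bar\phi\;\bar g^{\mu\nu}\,\partial_\nu\left(\Omega^{(1-n)/2}\bar\phi\right) \\
&- \Omega^{(1-n)/2-2}\,\delta\bar\phi\;\bar g^{\mu\nu}\,\partial_\mu\partial_\nu\left(\Omega^{(1-n)/2}\bar\phi\right) + \delta V - \Omega^{(1-n)/2+\alpha}\,\bar J\,\delta\bar\phi\Big\}.
\end{aligned} \tag{9.228}$$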


Notice how the extra terms involving $\Gamma_\mu$, which arise from derivatives acting 
on $\Omega$, are proportional to $(1 + n - 2) = n - 1$. These will clearly vanish 
in $n = 1$ dimensions, and thus we see how $n = 1$ is special for the scalar 
field. To fully express the action in terms of barred quantities, we now need 
to commute the factors of $\Omega$ through the remaining derivatives and cancel them 
against the factors in the integration measure. Each time $\Omega^{(1-n)/2}$ passes through 
a derivative, we pick up a term containing $(1 - n)\Gamma_\mu$, thus, provided we have 
$\alpha = -(n+3)/2$ and $\delta V = 0$, we may write 


$$\delta S = \int d^{n+1}x\,\sqrt{\bar g}\;\frac{1}{c}\left\{-\bar\Box\,\bar\phi - \bar J\right\}\delta\bar\phi + \text{terms}\times(n-1). \tag{9.229}$$


Clearly, in 1 + 1 dimensions, this equation is conformally invariant, provided 
the source J transforms in the correct way, and the potential V vanishes. The 
invariant equation of motion is 


$$-\bar\Box\,\bar\phi(x) = \bar J. \tag{9.230}$$


9.6.2 The Maxwell field in n + 1 dimensions 


The conformal properties of the Maxwell action are quite different to those of 
the scalar field, since the Maxwell action contains two powers of the inverse 
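metric, rather than one. Moreover, the vector source coupling $J^\mu A_\mu$ contains 
a power of the inverse metric because of the indices on $A_\mu$. Writing the action 
with metric explicit, we have 


$$S = \int d^{n+1}x\,\sqrt{g}\;\frac{1}{c}\left\{\frac{1}{4}F_{\mu\nu}\,g^{\mu\rho}g^{\nu\sigma}F_{\rho\sigma} + J_\mu\,g^{\mu\nu}A_\nu\right\}. \tag{9.231}$$
We now re-scale, as before, but with slightly different dimensional factors 


$$\begin{aligned}
g_{\mu\nu} &\to \Omega^2\,\bar g_{\mu\nu} \\
g &\to \Omega^{2(n+1)}(x)\,\bar g \\
A_\mu(x) &\to \Omega^{(3-n)/2}\,\bar A_\mu(x) \\
J_\mu &\to \Omega^{\alpha}\,\bar J_\mu,
\end{aligned} \tag{9.232}$$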


and vary the action to find the field equations: 
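$$\delta S = \int d^{n+1}x\,\sqrt{\bar g}\;\Omega^{n+1}\,\frac{1}{c}\left\{\partial_\mu\left(\delta\bar A_\nu\,\Omega^{(3-n)/2}\right)\Omega^{-4}\,\bar g^{\mu\rho}\bar g^{\nu\sigma}F_{\rho\sigma} + \bar J_\mu\,\Omega^{\alpha+(3-n)/2-2}\,\bar g^{\mu\nu}\,\delta\bar A_\nu\right\}. \tag{9.233}$$
Integrating by parts, we obtain 
$$\delta S = \int d^{n+1}x\,\sqrt{\bar g}\;\frac{1}{c}\,\Omega^{(3-n)/2}\,\delta\bar A_\nu\left\{-(n-3)\,\Gamma_\mu\,\Omega^{n-3}\,\bar g^{\mu\rho}\bar g^{\nu\sigma}F_{\rho\sigma} - \Omega^{n-3}\,\partial_\mu F_{\rho\sigma}\,\bar g^{\mu\rho}\bar g^{\nu\sigma} + \bar J_\mu\,\Omega^{\alpha+n-1}\,\bar g^{\mu\nu}\right\}. \tag{9.234}$$
On commuting the scale factor through the derivatives using 


$$\partial_\mu F_{\rho\sigma} = \partial_\mu\left\{\Omega^{(3-n)/2}\left[\frac{1}{2}(3-n)\left(\Gamma_\rho\bar A_\sigma - \Gamma_\sigma\bar A_\rho\right) + \bar F_{\rho\sigma}\right]\right\}, \tag{9.235}$$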
we see that we acquire further terms proportional to $n - 3$. Three dimensions 
clearly has a special significance for Maxwell’s equations, so let us choose $n = 3$ 
now and use the notation $\bar\partial_\mu$ to denote the fact that the derivative is contracted 
using the transformed metric $\bar g_{\mu\nu}$. This gives 


$$\delta S = \int d^{n+1}x\,\sqrt{\bar g}\;\frac{1}{c}\left\{-\bar\partial_\mu\bar F^{\mu\nu} + \bar J^\nu\right\}\delta\bar A_\nu = 0. \tag{9.236}$$
c 
Notice that the invariance of the equations of motion, in the presence of a 
current, depends on how the current itself scales. Suppose we couple to the 
current arising from a scalar field which has the general form $J_\mu \sim \phi^*\partial_\mu\phi$, 
then, from the previous section, this would scale by $\Omega^{n-1}$. For $n = 3$, this 
gives precisely $\alpha = n - 1 = 2$. Note, however, that the matter field itself 
is not conformally invariant in $n = 3$. As far as the electromagnetic sector 
is concerned, however, $n = 3$ gives us the conformally invariant equation of 
motion 


$$\bar\partial_\mu\bar F^{\mu\nu} = \bar J^\nu. \tag{9.237}$$


The above treatment covers only two of Maxwell’s four equations. The 
others arise from the Bianchi identity, 


$$\epsilon^{\mu\nu\lambda\rho}\,\partial_\nu F_{\lambda\rho} = 0. \tag{9.238}$$


The important thing to notice about this equation is that it is independent of the 
metric. All contractions are with the metric-independent, anti-symmetric tensor; 
the other point is precisely that it is anti-symmetric. Moreover, the field scale 
factor $\Omega^{(3-n)/2}$ is simply unity in $n = 3$, thus the remaining Maxwell equations 
are trivially invariant. 

In non-conformal dimensions, the boundary terms are also affected by the 
scale factor, Q. The conformal distortion changes the shape of a boundary, 
which must be compensated for by the other terms. Since the dimension 
in which gauge fields are invariant is different to the dimension in which 
matter fields are invariant, no gauge theory can be conformally invariant in flat 
spacetime. Conformally improved matter theories can be formulated in curved 
spacetime, however, in any number of dimensions (see section 11.6.3). 


9.7 Scale invariance 


Conformal invariance is an exacting symmetry. If we relax the x-dependence of 
$\Omega(x)$ and treat it as a constant, then there are further possibilities for invariance 
of the action. Consider 


$$S = \int (dx)\left\{\frac{1}{2}(\partial^\mu\phi)(\partial_\mu\phi) + \sum_l \frac{1}{l!}\,a_l\,\phi^l\right\}. \tag{9.239}$$

Table 9.2. Scale-invariant potentials. 

n = 1        n = 2                        n = 3 
All          $\frac{1}{6!}g\phi^6$        $\frac{1}{4!}\lambda\phi^4$ 

Let us scale 


$$\begin{aligned}
g_{\mu\nu} &\to g_{\mu\nu}\,\Omega^2 \\
\phi(x) &\to \phi(x)\,\Omega^{-\alpha},
\end{aligned} \tag{9.240}$$


where $\alpha$ is to be determined. Since the scale factors now commute with the 
derivatives, we can secure the invariance of the action for certain $l$ which satisfy 


$$\Omega^{n+1}\,\Omega^{-2-2\alpha} = 1 = \Omega^{n+1}\,\Omega^{-l\alpha}, \tag{9.241}$$
which solves to give $\alpha = \frac{n+1}{2} - 1$, and hence, 
$$l = \frac{n+1}{(n+1)/2 - 1}. \tag{9.242}$$


For $n = 3$, $l = 4$ solves this; for $n = 2$, $l = 6$ solves this; and for $n = 1$, it is not 
solved for any / since the field is dimensionless. We therefore have the globally 
scale-invariant potentials in table 9.2. 
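
The arithmetic behind these values is easy to tabulate. The sketch below assumes the reading $\alpha = (n+1)/2 - 1$ for the field's scaling exponent and $l = (n+1)/\alpha$ for the invariant power, and uses exact fractions; the function name is an illustrative choice.

```python
from fractions import Fraction


def scale_invariant_power(n):
    """Power l of phi^l that is scale invariant in n + 1 dimensions.

    Assumes alpha = (n + 1)/2 - 1 and l = (n + 1)/alpha; returns None when
    alpha vanishes, i.e. when the field is dimensionless.
    """
    alpha = Fraction(n + 1, 2) - 1
    if alpha == 0:
        return None
    return Fraction(n + 1) / alpha


for n in (1, 2, 3):
    print(n, scale_invariant_power(n))
```

For $n = 1$ the exponent $\alpha$ vanishes, so no finite power $l$ is singled out by this condition.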


9.8 Breaking spacetime symmetry 


The breakdown of a symmetry means that a constraint on the uniformity of 
a system is lost. This sometimes happens if systems develop structure. For 
example, if a uniformly homogeneous system suddenly becomes lumpy, perhaps 
because of a phase transition, then translational symmetry will be lost. If a 
uniform external magnetic field is applied to a system, rotational invariance 
is lost. When effects like these occur, one or more symmetry generators 
are effectively lost, together with the effect on any associated eigenvalues 
of the symmetry group. In a sense, the loss of a constraint opens up the 
possibility of more freedom or more variety in the system. In the opposite 
sense, it restricts the type of transformations which leave the system unchanged. 
Symmetry breakdown is often associated with the lifting of degeneracy of group 
eigenvalues, or quantum numbers. 

There is another sense in which symmetry is said to be broken. Some 
calculational procedures break symmetries in the sense that they invalidate the 
assumptions of the original symmetry. For example, the imposition of periodic 
boundary conditions on a field in a crystal lattice is sometimes said to break 
Lorentz invariance, 


$$\psi(x + L) = \psi(x). \tag{9.243}$$


The existence of a topological property such as periodicity does not itself break 
the Lorentz symmetry. If there is a loss of homogeneity, then translational 
invariance would be lost, but eqn. (9.243) does not imply this in any way: it 
is purely an identification of points in the system at which the wavefunction 
should have a given value. The field still transforms faithfully as a spacetime 
scalar. However, the condition in eqn. (9.243) does invalidate the assumptions 
of Lorentz invariance because the periodicity length L is a constant and we know 
that a boost in the direction of that periodicity would cause a length contraction. 
In other words, the fact that the boundary conditions themselves are stated in a 
way which is not covariant invalidates the underlying symmetry. 

Another example is the imposition of a finite temperature scale $\beta = 1/kT$. 
This is related to the last example because, in the Euclidean representation, a 
finite temperature system is represented as being periodic in imaginary time 
(see section 6.1.5). But whether we use imaginary time or not, the idea of 
a constant temperature is also a non-covariant concept. If we start in a heat 
bath and perform a boost, the temperature will appear to change because of 
the Doppler shift. Radiation will be red- and blue-shifted in the direction of 
travel, and thus it is only meaningful to measure a temperature at right angles to 
the direction of travel. Again, the assumption of constant temperature does not 
break any symmetry of spacetime, but the ignorance of the fact that temperature 
is a function of the motion leads to a contradiction. 


These last examples cannot be regarded as a breakdown of symmetry, because 
they are not properties of the system which are lost, they are only a violation of 
symmetry by the assumptions of a calculational procedure. 


9.9 Example: Navier-Stokes equations 


Consider the action for the velocity field: 
1 ; zs : 
S=t fæ [Zouo + pv'v! (Df vk) + O + s , (9.244) 


where 
and 


1 (0,0 
D,=0+T=0+-= = 
2\ e 


Di, = 36" a + TE = 8/8" at + EL aalo), (0.246) 
$$\rho\,\frac{Dv^i}{Dt} + (\partial_i P) - \mu\nabla^2 v^i = F^i, \tag{9.247}$$


where P is the pressure and F is a generalized force. This might be the effect 
of gravity or an electric field in the case of a charged fluid. 

These connections result from the spacetime dependence of the coordinate 
transformation. They imply that our transformation belongs to the conformal 
group rather than the Galilean group, and thus we end up with connection terms 


$$\frac{Dv^i}{Dt} = \partial_t v^i + v^j\partial_j v^i, \tag{9.248}$$


where 
$$\partial_\mu N^\mu = 0 \tag{9.249}$$


and $N^\mu = (N, Nv^i)$. 
Kinematical and dynamical transformations 


In addition to parameter symmetries, which express geometrical uniformity in 
spacetime, some symmetries relate to uniformities in the more abstract space 
of the dynamical variables themselves. These ‘internal’ symmetries can contain 
group elements which depend on the spacetime parameters, so that there is a 
cross-dependence on the internal and external parameters; they are intimately 
connected to the concept of ‘charge’ (see also chapter 12). 

Internal symmetries are not necessarily divorced from geometrical (parame- 
ter) invariances, but they may be formulated independently of them. The link 
between the two is forged by the spacetime properties of the action principle, 
through interactions between fields which act as generators for the symmetries 
(see, for instance, section 11.5). 


10.1 Global or rigid symmetries 


The simplest symmetries are global symmetries, whose properties are indepen- 
dent of spacetime location. For example, the action 


$$S = \int (dx)\left\{\frac{1}{2}(\partial^\mu\phi)(\partial_\mu\phi) + \frac{1}{2}m^2\phi^2\right\} \tag{10.1}$$


is invariant under the $Z_2$ reflection symmetry $\phi(x) \to -\phi(x)$ at all spacetime 
points. This symmetry would be broken by a term of the form 


$$S = \int (dx)\left\{\frac{1}{2}(\partial^\mu\phi)(\partial_\mu\phi) + \frac{1}{2}m^2\phi^2 + \frac{1}{3!}\alpha\,\phi^3\right\}. \tag{10.2}$$


The next most commonly identified symmetry is the U(1) phase symmetry, 
which is exhibited by complex fields: 


$$\Phi \to e^{i\theta}\,\Phi. \tag{10.3}$$
The action 
$$S = \int (dx)\left\{\frac{1}{2}(\partial^\mu\Phi)^*(\partial_\mu\Phi) + \frac{1}{2}m^2\,\Phi^*\Phi\right\} \tag{10.4}$$


is invariant under this transformation. This symmetry is related to the idea of 
electric charge. One can say that charge is a book-keeping parameter for this 
underlying symmetry, or vice versa. 

Multi-component fields also possess global symmetries. For instance, the 
model 


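$$S = \int (dx)\left\{\frac{1}{2}(\partial^\mu\phi_A)(\partial_\mu\phi_A) + \frac{1}{2}m^2\,\phi_A\phi_A\right\} \tag{10.5}$$
is invariant under the transformation 
$$\phi_A \to U_A{}^{B}\,\phi_B, \tag{10.6}$$
where 
$$(U^{\rm T})_A{}^{B}\,U_B{}^{C} = \delta_A{}^{C}, \tag{10.7}$$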


or $U^{\rm T}U = I$. This is the group of orthogonal transformations $O(N)$, where 
$A, B = 1, \ldots, N$. Multi-level atom bound states can be represented in this way; 
see, for instance, section 10.6.3. Multi-component symmetries of this kind 
form groups which are generally non-Abelian (see chapter 23 for further details 
on the formulation of non-Abelian field theory). 
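
The invariance under orthogonal transformations can be made concrete with a few lines of code. The sketch below builds a rotation in one plane of the $N$-component field space, checks $U^{\rm T}U = I$, and confirms that the combination $\phi_A\phi_A$ appearing in the mass term is unchanged; for a constant (global) $U$ the kinetic term is invariant for the same reason. The helper names and sample values are illustrative assumptions.

```python
import math


def plane_rotation(size, i, j, angle):
    """An O(N) matrix rotating components i and j of the field by the given angle."""
    u = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]
    u[i][i] = math.cos(angle)
    u[j][j] = math.cos(angle)
    u[i][j] = -math.sin(angle)
    u[j][i] = math.sin(angle)
    return u


def transform(u, phi):
    """phi_A -> U_A^B phi_B."""
    size = len(phi)
    return [sum(u[a][b] * phi[b] for b in range(size)) for a in range(size)]


def invariant(phi):
    """phi_A phi_A, the combination appearing in the mass term."""
    return sum(component * component for component in phi)


def is_orthogonal(u, tol=1e-12):
    size = len(u)
    return all(
        abs(sum(u[k][a] * u[k][b] for k in range(size)) - (1.0 if a == b else 0.0)) < tol
        for a in range(size) for b in range(size)
    )


phi = [1.0, -2.0, 0.5]                 # hypothetical field values at one point
u = plane_rotation(3, 0, 2, 0.7)

print(is_orthogonal(u))
print(abs(invariant(transform(u, phi)) - invariant(phi)) < 1e-12)
```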

The physical significance of global symmetries is not always clear a priori. 
They represent global correlations of properties over the whole of spacetime 
simultaneously, which apparently contradicts special relativity. Often the 
analysis of global symmetries is only a prelude to studying local ones. Even 
in section 10.6.3, the global symmetry appears only as a special case of a larger 
local symmetry. One often finds connections between spacetime symmetries and 
phase symmetries which make local symmetries more natural. This is especially 
true in curved spacetime or inhomogeneous systems. 

In practice, global symmetries are mainly used in non-relativistic, small 
systems where simultaneity is not an issue, but there is a lingering suspicion 
that global symmetries are only approximations to more complex local ones. 


10.2 Local symmetries 


A symmetry is called local if it involves transformations which depend on 
coordinates. Allowing a phase transformation to depend on the coordinates is 
sometimes referred to as ‘gauging the symmetry’. For example, the local version 
of the complex U(1) symmetry is 


$$\begin{aligned}
\Phi &\to e^{i\theta(x)}\,\Phi \\
\Gamma_\mu &\to \Gamma_\mu - (\partial_\mu\theta).
\end{aligned} \tag{10.8}$$
The action now needs to be modified in order to account for the fact that partial 
derivatives do not commute with these transformations. The partial derivative is 
exchanged for a covariant one, which includes the connection $\Gamma_\mu(x)$, 


$$D_\mu = \partial_\mu + i\,\Gamma_\mu. \tag{10.9}$$


$$S = \int (dx)\left\{\frac{1}{2}(D^\mu\Phi)^*(D_\mu\Phi) + \frac{1}{2}m^2\,\Phi^*\Phi\right\}. \tag{10.10}$$


The most important way in which abstract field symmetries connect with 
spacetime properties is through the derivative operator, since this is the generator 
of dynamical behaviour in continuous, holonomic systems. 
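
To see why the connection is needed, one can compare the ordinary and covariant derivatives numerically. The sketch below works in one dimension and assumes the conventions $D = \partial + i\Gamma$, $\Phi \to e^{i\theta(x)}\Phi$ and $\Gamma \to \Gamma - \partial\theta$; the particular field, phase and connection functions are hypothetical and chosen only for illustration.

```python
import cmath
import math


def phi(x):
    # hypothetical complex field
    return cmath.exp(1.3j * x) * (1.0 + 0.5 * x * x)


def theta(x):
    # hypothetical position-dependent phase
    return 0.7 * math.sin(x)


def gamma(x):
    # hypothetical connection
    return 0.2 * x


def derivative(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)


def covariant_derivative(f, connection, x):
    """D f = d f / dx + i Gamma f (assumed sign convention)."""
    return derivative(f, x) + 1j * connection(x) * f(x)


def phi_transformed(x):
    return cmath.exp(1j * theta(x)) * phi(x)


def gamma_transformed(x):
    return gamma(x) - derivative(theta, x)


x0 = 0.4
phase = cmath.exp(1j * theta(x0))

# the ordinary derivative picks up an extra term proportional to d(theta)/dx
ordinary_mismatch = abs(derivative(phi_transformed, x0) - phase * derivative(phi, x0))

# the covariant derivative transforms like the field itself
covariant_mismatch = abs(
    covariant_derivative(phi_transformed, gamma_transformed, x0)
    - phase * covariant_derivative(phi, gamma, x0)
)

print(ordinary_mismatch)
print(covariant_mismatch)
```

The first number is of order one, while the second is zero to within the finite-difference error, which is the statement that $D_\mu\Phi$ transforms by the same local phase as $\Phi$.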


10.3 Derivatives with a physical interpretation 


Covariance with respect to local symmetries of the action may be made manifest 
by re-writing the action in terms of an effective derivative. The physical 
motivation for this procedure is that the ordinary partial derivative does not 
have an invariant physical interpretation under local symmetry transformations. 
By adding additional terms, called ‘connections’, to a partial derivative $\partial_\mu$, one 
creates an ‘effective derivative’, $D_\mu$, which does have an invariant meaning. 
Although the definition of a new object, $D_\mu$, is essentially a notational matter, 
the notation is important because it assigns a unique interpretation to the new 
derivative symbol, in any basis. For that reason, $D_\mu$ is called a covariant 
derivative. 

There are two related issues in defining derivatives which have a physical 
interpretation. The first issue has to do with the physical assumption that mea- 
surable quantities are associated with Hermitian operators (Hermitian operators 
have real eigenvalues). The second has to do with form invariance under specific 
transformations. 


10.3.1 Hermiticity 


According to the standard interpretation of quantum mechanics, physical quan- 
tities are derived from Hermitian operators, since Hermitian operators have real 
eigenvalues. Hermitian operators are self-adjoint with respect to the scalar 
product: 


$$
\langle\phi|O|\phi\rangle = (O^\dagger\phi, \phi) = (\phi, O\phi), \tag{10.11}
$$

or formally

$$
O^\dagger = O. \tag{10.12}
$$
If the operator $O$ is a derivative operator, it can be moved from the left hand side
of the inner product to the right hand side and back by partial integration. This
follows from the definition of the inner product. For example, in the case of the
Schrödinger field, we have

$$
\begin{aligned}
(\psi_1, i\partial_\mu\psi_2) &= \int d\sigma\; \psi_1^\dagger\,(i\partial_\mu\psi_2) \\
 &= \int d\sigma\; (-i\partial_\mu\psi_1^\dagger)\,\psi_2 \\
 &= (-i\partial_\mu\psi_1, \psi_2).
\end{aligned} \tag{10.13}
$$


Partial integration moves the derivative from $\psi_2$ to $\psi_1$ and changes the sign. This
sign change means that $i\partial_\mu$ is not a Hermitian operator. In order for a derivative
operator to be Hermitian, it must not change sign. Thus, a quadratic derivative,
$\partial^2$, would be Hermitian. For linear derivatives, we should symmetrize the left-right
nature of the derivative. Using arrow notation to show the direction in
which the derivative acts, we may write

$$
i\partial_\mu \to i\overleftrightarrow{\partial_\mu} \equiv
\frac{i}{2}\left(\overrightarrow{\partial_\mu} - \overleftarrow{\partial_\mu}\right). \tag{10.14}
$$

Partial integration preserves the sign of $\overleftrightarrow{\partial_\mu}$.

A second important situation occurs when this straightforward partial integra- 
tion is obstructed by a multiplying function. This is commonly the situation for 
actions in curvilinear coordinates where the Jacobian in the volume measure is 
a function of the coordinates themselves. The same thing occurs in momentum 
space. To see this, we note that the volume measure in the inner product is 


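$$
d\sigma = |J(x)|\,d^n x, \tag{10.15}
$$

where $J(x)$ is the Jacobian of the coordinates relative to a Cartesian basis.
Normally, $J(x) = \sqrt{g_{ij}(x)}$, where $g_{ij}(x)$ is the spatial metric and the
square root is understood to act on its determinant. If we now try to
integrate by parts with this volume measure, we pick up an extra term involving
the derivative of this function:

$$
\int d^n x\,|J(x)|\;\psi_1^\dagger\,(i\partial_\mu\psi_2) =
\int d^n x\,|J(x)| \left(-i\partial_\mu - i\,\frac{\partial_\mu J}{J}\right)\psi_1^\dagger\;\psi_2. \tag{10.16}
$$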


This problem affects x derivatives in curved x coordinates and k derivatives in 
Fourier transform space, on the ‘mass shell’. See table 10.1. 

Table 10.1. Derivatives and measures.

Derivative                     Measure
$\partial_\mu$                 $\sqrt{g_{ij}(x)}\,d^n x$
$\partial/\partial k^\mu$      $d^n k / 2\omega(k)$

The partial derivatives in table 10.1 are clearly not Hermitian. The problem
now is the extra term owing to the coordinate-dependent measure. We can solve
this problem by introducing an extra term, called a ‘connection’, which makes
the derivative have the right properties under integration by parts. The crux
of the matter is to find a linear derivative operator which changes sign under
integration by parts, but does not pick up any new terms. Then we are back
to the first example above, and further symmetrization is trivial. Consider the
spacetime derivative. The problem will be solved if we define a new derivative
by

$$
D_\mu = \partial_\mu + \Gamma_\mu, \tag{10.17}
$$


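and demand that $\Gamma_\mu$ be determined by requiring that $D_\mu$ only change sign under
partial integration:

$$
\int d^n x\,|J(x)|\;\phi_1\,(D_\mu\phi_2) = \int d^n x\,|J(x)|\;(-D_\mu\phi_1)\,\phi_2. \tag{10.18}
$$

Substituting eqn. (10.17) into eqn. (10.18), we find that $\Gamma_\mu$ must satisfy

$$
-(\partial_\mu J) + J\,\Gamma_\mu = -J\,\Gamma_\mu, \tag{10.19}
$$

or

$$
\Gamma_\mu = \frac{1}{2}\,\frac{\partial_\mu J}{J}. \tag{10.20}
$$

The new derivative $D_\mu$ can be used to construct symmetrical derivatives such as
$D^2 = D_\mu D^\mu$ and $\overleftrightarrow{D_\mu}$, by analogy with the partial derivative.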
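The role of the connection in eqns. (10.17)-(10.20) can be checked numerically.
The following is a minimal sketch, not taken from the text: it assumes a radial
measure $J(r) = r^2$ on the interval $0 \le r \le 1$ and two illustrative test
functions which vanish at $r = 1$, so that no boundary terms survive. All names
in the code are hypothetical.

```python
# Numerical check of eqns. (10.17)-(10.20) for an ASSUMED measure J(r) = r^2.
# phi1 and phi2 are illustrative test functions that vanish at r = 1.

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule; never evaluates f at the end points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

J = lambda r: r**2                        # assumed Jacobian
dJ = lambda r: 2.0 * r                    # its derivative
Gamma = lambda r: dJ(r) / (2.0 * J(r))    # connection, eqn. (10.20)

phi1, dphi1 = (lambda r: 1.0 - r), (lambda r: -1.0)
phi2, dphi2 = (lambda r: (1.0 - r)**2), (lambda r: -2.0 * (1.0 - r))

def D(phi, dphi):
    """Return the function r -> (d/dr + Gamma) phi, as in eqn. (10.17)."""
    return lambda r: dphi(r) + Gamma(r) * phi(r)

# Left- and right-hand sides of eqn. (10.18)
lhs = midpoint(lambda r: J(r) * phi1(r) * D(phi2, dphi2)(r), 0.0, 1.0)
rhs = midpoint(lambda r: -J(r) * D(phi1, dphi1)(r) * phi2(r), 0.0, 1.0)

# The same comparison with the bare partial derivative
plain_lhs = midpoint(lambda r: J(r) * phi1(r) * dphi2(r), 0.0, 1.0)
plain_rhs = midpoint(lambda r: -J(r) * dphi1(r) * phi2(r), 0.0, 1.0)

print(lhs, rhs)              # these two agree
print(plain_lhs, plain_rhs)  # these two do not
```

With the connection included, the two sides agree (both equal $-1/60$ for these
test functions); with the bare derivative they differ by the extra term coming
from the derivative of the measure.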


10.3.2 Commutativity with transformations 


The problem of additional terms arising due to the presence of functions of 
the coordinates occurs not just with the integration measure but also with 
transformations of the fields. Imagine a field theory involving the field variable
$\phi(x)$, a simple scalar field satisfying an equation of motion given by

$$
-\Box\,\phi = -\partial_\mu\partial^\mu\phi = 0. \tag{10.21}
$$

We then consider the transformation

$$
\phi(x) \to \phi(x)\,U(x), \tag{10.22}
$$
where $U(x)$ is an arbitrary function of $x$. This situation crops up quite often in
field theory, when $U(x)$ is a phase transformation. The first thing we notice is
that our equation of motion (10.21) is neither covariant nor invariant under this
transformation, since

$$
\partial_\mu\phi \to (\partial_\mu\phi(x))\,U(x) + (\partial_\mu U(x))\,\phi(x). \tag{10.23}
$$


Clearly eqn. (10.21) is only a special case of the equations of motion. Under
a transformation we will always pick up new terms, as in eqn. (10.23), since
the partial derivative does not commute with an arbitrary function $U(x)$, so
$U(x)$ can never be cancelled out of the equations. But, suppose we re-write
eqn. (10.23) as

$$
\partial_\mu(\phi(x)U(x)) = U(x)\left(\partial_\mu + \frac{\partial_\mu U}{U}\right)\phi(x), \tag{10.24}
$$

and define a new derivative

$$
D_\mu = (\partial_\mu + \Gamma_\mu), \tag{10.25}
$$

where $\Gamma_\mu = U^{-1}(\partial_\mu U) = \partial_\mu \ln U$, then we have

$$
\partial_\mu(U(x)\phi(x)) = U(x)\,D_\mu(\phi(x)). \tag{10.26}
$$


We can now try to make eqn. (10.21) covariant. We replace the partial derivative
by a covariant one, giving

$$
-D^2\phi(x) = -D_\mu D^\mu\phi(x) = 0. \tag{10.27}
$$

The covariance can be checked by applying the transformation

$$
-\Box\,(U(x)\phi(x)) = -U(x)\,D^2(\phi(x)) = 0, \tag{10.28}
$$

so that the factor of $U(x)$ can now be cancelled from both sides.
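The identity behind eqn. (10.26), namely that the derivative of $U\phi$ equals $U$
times the effective derivative of $\phi$ when $\Gamma = \partial \ln U$, is easy to
confirm numerically in one dimension. The sketch below assumes the illustrative
choices $U(x) = e^{ix^2}$ and $\phi(x) = \sin x$; neither appears in the text, and
the function names are hypothetical.

```python
import cmath

# ASSUMED illustrative choices: a local phase U(x) = exp(i x^2), phi(x) = sin(x).
U = lambda x: cmath.exp(1j * x**2)
dU = lambda x: 2j * x * cmath.exp(1j * x**2)
phi = lambda x: cmath.sin(x)
dphi = lambda x: cmath.cos(x)

def Gamma(x):
    """Connection Gamma = U^{-1} (dU/dx) = d(ln U)/dx, as defined for eqn. (10.25)."""
    return dU(x) / U(x)

def D_phi(x):
    """Effective derivative (d/dx + Gamma) acting on phi, eqn. (10.25)."""
    return dphi(x) + Gamma(x) * phi(x)

def central_diff(f, x, h=1e-5):
    """Second-order central finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 0.7
lhs = central_diff(lambda x: U(x) * phi(x), x0)   # d/dx (U phi)
rhs = U(x0) * D_phi(x0)                           # U D(phi), eqn. (10.26)
bare = U(x0) * dphi(x0)                           # U d(phi)/dx, no connection

print(abs(lhs - rhs))    # small: limited only by finite-difference error
print(abs(lhs - bare))   # not small: the extra term of eqn. (10.23)
```

The second printed number is the magnitude of the term $(\partial U)\phi$ that the
bare partial derivative fails to account for.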

At this point, it almost looks as though we have achieved an invariance in 
the form of the equations, but that is not so. To begin with, the derivative we 
introduced only works for a specific function U (x), and that function is actually 
buried in the definition of the new derivative, so all we have done is to re-write 
the equation in a new notation. If we change the function, we must also change 
the derivative. Also, if we add a source to the right hand side of the equations, 
then this argument breaks down. In other words, while the equation is now 
written in a more elegant way, it is neither covariant nor invariant since the 
specific values of the terms must still change from case to case. 


10.3.3 Form-invariant derivatives 


To obtain invariance requires another idea, and this involves a physical
assumption. Instead of defining $\Gamma_\mu = U^{-1}(\partial_\mu U)$, we say that $\Gamma_\mu$ is itself a new
physical field; in addition, we demand that the transformation law be extended
to include a transformation of the new field $\Gamma_\mu$. The new transformation rule is
then

$$
\phi(x) \to U(x)\,\phi(x), \qquad
\Gamma_\mu \to \Gamma_\mu - \frac{\partial_\mu U}{U}. \tag{10.29}
$$

$\Gamma_\mu$ might be zero in some basis, but not always. Under this new assumption,
only the physical fields transform. The covariant derivative is form-invariant, as
are the equations of motion, since $\Gamma_\mu$ absorbs the extra term which is picked up
by the partial differentiation.

Note how this last step is a physical assumption. Whereas everything
leading up to eqn. (10.28) has simply been a mathematical manipulation of
the formulation, the assumption that $\Gamma_\mu$ is a new field, which transforms
separately, is a physical assumption. This makes symmetries of this type
dynamical symmetries, rather than coincidental kinematical symmetries, which
arise simply as a matter of fortuitous cancellations.

The covariant derivative crops up in several guises — most immediately in 
connection with the interaction of matter with the electromagnetic field, and the 
invariance of probabilities under arbitrary choice of quantum phase. 
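The form invariance described in this section can be verified in a few lines. As
a sketch, assume the transformation rule of eqn. (10.29) reads
$\phi \to U\phi$, $\Gamma_\mu \to \Gamma'_\mu = \Gamma_\mu - (\partial_\mu U)/U$, with
$D_\mu = \partial_\mu + \Gamma_\mu$; the primed symbols are notation introduced here
for the transformed quantities.

```latex
\begin{align*}
D'_\mu\bigl(U\phi\bigr)
  &= \bigl(\partial_\mu + \Gamma'_\mu\bigr)\bigl(U\phi\bigr) \\
  &= U\,\partial_\mu\phi + (\partial_\mu U)\,\phi
     + \Bigl(\Gamma_\mu - \frac{\partial_\mu U}{U}\Bigr)\,U\phi \\
  &= U\,\bigl(\partial_\mu + \Gamma_\mu\bigr)\phi
   \;=\; U\,D_\mu\phi .
\end{align*}
```

The inhomogeneous term produced by the partial derivative is cancelled by the
shift of the connection, for any $U(x)$, which is what distinguishes this case
from the fixed choice $\Gamma_\mu = U^{-1}(\partial_\mu U)$ of section 10.3.2.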


10.4 Charge conjugation 


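A charge conjugation transformation, for a field with sufficient internal
symmetry, is defined to be one which has the following properties on spin 0,
$\frac{1}{2}$, and 1 fields:

$$
\begin{aligned}
C\,\phi(x)\,C^\dagger &= \eta_c\,\phi^\dagger(x) \\
C\,\psi(x)\,C^\dagger &= \eta_c\,\overline{\psi}^{\,T}(x) \\
C\,A_\mu\,C^\dagger &= -A_\mu.
\end{aligned} \tag{10.30}
$$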


Under this transformation, the sign of the gauge field (and hence the sign of the
charge it represents) is reversed. It is clearly a discrete rather than a continuous
transformation. In the complex scalar case, the transformation simply exchanges
the conjugate pair of fields. This is easy to see in the formulation of the
complex scalar as a pair of real fields (see section 19.7), where the field,
$A_\mu$, is accompanied by the anti-symmetric tensor $\epsilon_{AB}$, which clearly changes
sign on interchange of scalar field components. In the Dirac spinor case, a
more complicated transformation is dictated by the Dirac matrices (see section
20.3.4).


10.5 TCP invariance


The TCP theorem [87, 88, 105, 114] asserts that any local physical Lagrangian 
must be invariant under the combined action of time reversal (T), parity (P) 
and charge conjugation (C). More specifically, it claims that the effect of CP 
should be the same as T. Interactions may be constructed which violate these 
symmetries separately, but the TCP theorem requires the product of these 
transformations 


$$
U_{TCP} = U_T\,U_P\,U_C \tag{10.31}
$$
to be conserved: 
Urcp $ (x) Urep = nemnp 6" (—x) 
Urep W(x) Ue = —Ysnennp W*(—x) 
Urcp Ay (x) Uep = —Nemmp A}, (—x). (10.32) 


A choice of phase such that $\eta_c\eta_t\eta_p = 1$ is natural. This transformation
has particularly interesting consequences in the case of a spin-$\frac{1}{2}$ field. If one
considers a bi-linear term in the action, of the form

$$
\Delta L = \overline{\psi}_1(x)\,O(x)\,\psi_2(x), \tag{10.33}
$$
then the application of the transformation leads to 
Urer [Wy (x) O(x) Wa(x)] Ucp = Uro [yiy’ O(x)W2(x)] Ucp 
[yO ysy’ Owy] 
-Fi (-2) 75 Owy W2(—¥)] 


[Wi(—x) ys Ox) yso(—x) I". 
(10.34) 


In the last two lines, a minus sign appears first when commuting $\gamma_5$ through $\gamma^0$,
then a second minus sign must be associated with commuting $\psi_1$ and $\psi_2$. Under
the combination of TCP, one also has scalar behaviour

$$
\gamma_5\,O(x)\,\gamma_5 = -O(-x). \tag{10.35}
$$


Regardless of what one chooses to view as fundamental, the invariance under 
TCP and the anti-commutativity of the Dirac field go hand in hand 


Urer [0 1(x) OC) Wa (a)] Ure = Wy Cx) ax) a). (10.36) 


What is noteworthy about the TCP theorem is that it relates environmental, 
spacetime symmetries (space and time reflection) to internal degrees of freedom 
(charge reflection). This result follows from the locality and Hermiticity of the 
action, but requires also a new result: the spin-statistics theorem, namely that
spin-$\frac{1}{2}$ particles must anti-commute. This means that fermionic variables should
be represented by anti-commuting Grassmann variables.


10.6 Examples


The following examples show how symmetry requirements and covariance 
determine the structure of the action under both internal and spacetime sym- 
metries. The link between spacetime and internal symmetry, brought markedly 
to bear in the TCP theorem, is also reflected through conformal symmetry and 
transformation connections. 


10.6.1 Gauge invariance: electromagnetism 


The Schrödinger equation has the form 


$$
\left(-\frac{\hbar^2}{2m}\,\partial^i\partial_i + V\right)\psi = i\hbar\,\partial_t\psi. \tag{10.37}
$$


The wavefunction $\psi(x)$ is not a direct physical observable of this equation.
However, the probability

$$
P = |\psi|^2 \tag{10.38}
$$

is observable. As the squared modulus of a complex number, the probability is
invariant under phase transformations of the form

$$
\psi(x) \to e^{i\theta(x)}\,\psi(x). \tag{10.39}
$$

One expects that the Schrödinger action should be invariant under this symmetry
too. It should be clear from the discussion in section 10.3 that this is not
the case as long as the phase $\theta(x)$ is $x$-dependent; to make the Schrödinger
equation invariant, we must introduce a new field, $A_\mu$. By appealing to the
phenomenology of the Aharonov-Bohm effect, one can identify $A_\mu$ with the
electromagnetic vector potential.

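From eqn. (2.44), one may assume the following form for the covariant
derivative:

$$
-i\hbar\,\partial_\mu \to -i\hbar\,D_\mu = -i\hbar\left(\partial_\mu - i\frac{e}{\hbar}A_\mu\right), \tag{10.40}
$$

since it only differs from a completely general expression by some constants $c$, $\hbar$
and $e$. In explicit terms, we have chosen $\Gamma_\mu = -i\frac{e}{\hbar}A_\mu$. The total gauge or phase
transformation is now a combination of eqns. (10.37) and (10.39), and to secure
invariance of the equation, we must perform both transformations together.

Applying the phase transformation and demanding that $D_\mu$ commute with the
phase leads to

$$
\begin{aligned}
D_\mu\left(e^{i\theta(x)}\psi(x)\right) &= e^{i\theta(x)}\left(\partial_\mu - i\frac{e}{\hbar}\bigl(A_\mu + \partial_\mu s\bigr) + i(\partial_\mu\theta)\right)\psi(x) \\
 &= e^{i\theta(x)}\,D_\mu(\psi(x)),
\end{aligned} \tag{10.41}
$$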
where $D_\mu = \partial_\mu - i\frac{e}{\hbar}A_\mu$ and the last line follows provided we take

$$
i(\partial_\mu\theta) - i\frac{e}{\hbar}(\partial_\mu s) = 0. \tag{10.42}
$$

Both $\theta(x)$ and $s(x)$ are completely arbitrary scalar fields, so this relation merely
identifies them to be the same arbitrary quantity. We may therefore write the
combined phase and gauge transformations in the final form

$$
\begin{aligned}
\psi(x) &\to \psi'(x) = e^{i\frac{e}{\hbar}s(x)}\,\psi(x) \\
A_\mu(x) &\to A'_\mu(x) = A_\mu(x) + (\partial_\mu s(x)),
\end{aligned} \tag{10.43}
$$

and Schrödinger’s equation in gauge-invariant form is

$$
\left(-\frac{\hbar^2}{2m}\,D^i D_i + V\right)\psi(x) = i\hbar\,D_t\psi, \tag{10.44}
$$

where $D_t = cD_0$. In terms of the covariant derivative, we can write the field
strength tensor as a commutator:

$$
[D_\mu, D_\nu] = -i\frac{e}{\hbar}\,F_{\mu\nu}. \tag{10.45}
$$


This may be compared with eqn. (10.58) in the following section. 
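The commutator form of the field strength follows directly from the definition
of the covariant derivative. As a sketch, assume
$D_\mu = \partial_\mu - i\frac{e}{\hbar}A_\mu$ and the usual definition
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, acting on an arbitrary
test function $\psi$:

```latex
\begin{align*}
[D_\mu, D_\nu]\,\psi
 &= \Bigl(\partial_\mu - i\tfrac{e}{\hbar}A_\mu\Bigr)
    \Bigl(\partial_\nu - i\tfrac{e}{\hbar}A_\nu\Bigr)\psi
    \;-\; (\mu \leftrightarrow \nu) \\
 &= -i\tfrac{e}{\hbar}\bigl[(\partial_\mu A_\nu) - (\partial_\nu A_\mu)\bigr]\psi \\
 &= -i\tfrac{e}{\hbar}\,F_{\mu\nu}\,\psi .
\end{align*}
```

The second-derivative terms, the terms of the form $A_\mu\partial_\nu\psi$ and the
terms quadratic in $A_\mu$ are all symmetric in $\mu$ and $\nu$, so they cancel in the
commutator; only the derivative of the connection survives.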


10.6.2 Lorentz invariance: gravity 


In the presence of a non-trivial metric $g_{\mu\nu}$, i.e. in the curved spacetime of a
gravitational field, or in a curvilinear coordinate system, the Lorentz
transformation is not merely a passive kinematic transformation, it has the appearance
of a dynamical transformation. This change of character is accompanied by the
need for a transforming connection, like the ones above, only now using a more
complex rule, fit for general tensor fields.

The Lorentz-covariant derivative is usually written $\nabla_\mu$, so that covariance is
obtained by substituting partial derivatives in the following manner:

$$
\partial_\mu \to \nabla_\mu. \tag{10.46}
$$


With Lorentz transformations there is a subtlety, since we are interested in many 
different representations of the Lorentz group, i.e. in tensors of different rank. 
For scalar fields, there is no problem for Lorentz transformations. A scalar field 
does not transform under a Lorentz transformation, so the partial derivative is 
Hermitian. In other words, 


$$
\nabla_\mu\phi(x) = \partial_\mu\phi(x). \tag{10.47}
$$
For a vector field, however, the story is different. Now the problem is that
vectors transform according to the rules of tensor transformations and the partial
derivative of a vector field does not commute with Lorentz transformations
themselves. To fix this, a connection is required.¹ As before, we look for a
connection which makes the derivative commute with the transformation law.
Consider the vector field $V_\mu$. Let us transform it from one set of coordinates,
$\xi^\alpha, \xi^\beta$, to another, $x^\mu, x^\nu$. According to the rules of tensor transformation, we
have

$$
\begin{aligned}
V_\mu(x) &= \frac{\partial\xi^\beta}{\partial x^\mu}\,V_\beta(\xi) \\
 &= (\partial_\mu\xi^\beta)\,V_\beta(\xi) \\
 &= L_\mu{}^{\beta}\,V_\beta(\xi).
\end{aligned} \tag{10.48}
$$


Let us now introduce a derivative $\nabla_\mu$ with the property that

$$
\nabla(LV) = L(\nabla' V), \tag{10.49}
$$

i.e. such that the derivative $\nabla_\mu$ is form-invariant, but transforms dynamically
under a coordinate transformation. Let us write

$$
\nabla_\mu = \partial_\mu + \Gamma_\mu{}^{?}, \tag{10.50}
$$


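where the question mark is to be determined. At this stage, it is not clear just
how the indices will be arranged on $\Gamma$, since there are several possibilities when
acting on a vector field. Let us evaluate

$$
\begin{aligned}
\nabla_\mu V_\nu(x) &= \nabla_\mu\left(L_\nu{}^{\beta}\,V_\beta(\xi)\right) \\
 &= \nabla_\mu\left((\partial_\nu\xi^\beta)\,V_\beta(\xi)\right) \\
 &= (\partial_\mu + \Gamma_\mu)\left((\partial_\nu\xi^\beta)\,V_\beta(\xi)\right) \\
 &= (\partial_\mu\partial_\nu\xi^\beta)\,V_\beta(\xi) + (\partial_\nu\xi^\beta)(\partial_\mu V_\beta) + \Gamma_\mu\,(\partial_\nu\xi^\beta)\,V_\beta.
\end{aligned} \tag{10.51}
$$

From the assumed form of the transformation, we expect this to be

$$
L_\nu{}^{\beta}\,\nabla'_\mu V_\beta(\xi) = (\partial_\nu\xi^\beta)\,(\partial_\mu + \Gamma'_\mu)\,V_\beta. \tag{10.52}
$$

Comparing eqn. (10.51) and eqn. (10.52), we see that

$$
(\partial_\nu\xi^\beta)\,\Gamma_\mu \to (\partial_\nu\xi^\beta)\,\Gamma_\mu - (\partial_\mu\partial_\nu\xi^\beta). \tag{10.53}
$$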
¹ There are two ways to derive the connection for Lorentz transformations, one is to look at
the Hermitian nature of the derivatives; the other is to demand that the derivative of a tensor
always be a tensor. Either way, one arrives at the same answer, for essentially the same reason.

Multiplying through by $(\partial_\beta x^\lambda)$ and using the chain-rule, we see that the
transformation of $\Gamma$ must be

$$
\Gamma \to \Gamma - (\partial_\beta x^\lambda)(\partial_\mu\partial_\nu\xi^\beta). \tag{10.54}
$$


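This also shows us that there must be three indices on $\Gamma$, so that the correct
formulation of the vector-covariant derivative is

$$
\nabla_\mu V_\nu = \partial_\mu V_\nu - \Gamma^{\lambda}_{\mu\nu}\,V_\lambda, \tag{10.55}
$$

with transformation rule

$$
\Gamma^{\lambda}_{\mu\nu} \to \Gamma^{\lambda}_{\mu\nu} - (\partial_\beta x^\lambda)(\partial_\mu\partial_\nu\xi^\beta). \tag{10.56}
$$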


Thus, demanding commutativity with a dynamical transformation, once again 
requires the introduction of a corrective term, or connection. 

What turns a coordinate transformation into a dynamical transformation is 
the spacetime dependence of the metric. It makes the coordinate transformation 
into a spacetime-dependent quantity also, changing its status from a passive 
kinematical property to an active dynamical one. The non-linearity which is 
implied by having coordinates which depend on other coordinates is what leads 
Einstein’s theory of gravity to use the concept of intrinsic curvature. 

The above procedure can be generalized to any tensor field. Extra terms will 
be picked up for each index, since there is a coordinate transformation term 
for each index of a tensor. The sign of the correction depends on whether 
indices are raised or lowered, because of the mutually reciprocal nature of the 
transformations in these cases. To summarize, we have spacetime-covariant 
derivatives defined as follows: 


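$$
\begin{aligned}
\nabla_\mu\phi(x) &= \partial_\mu\phi(x) \\
\nabla_\mu A_\nu &= \partial_\mu A_\nu - \Gamma^{\lambda}_{\mu\nu}\,A_\lambda \\
\nabla_\mu A^\nu &= \partial_\mu A^\nu + \Gamma^{\nu}_{\mu\lambda}\,A^\lambda \\
\nabla_\mu T^{\nu\lambda}{}_{\rho\sigma} &= \partial_\mu T^{\nu\lambda}{}_{\rho\sigma}
 + \Gamma^{\nu}_{\mu\alpha}\,T^{\alpha\lambda}{}_{\rho\sigma}
 + \Gamma^{\lambda}_{\mu\alpha}\,T^{\nu\alpha}{}_{\rho\sigma}
 - \Gamma^{\alpha}_{\mu\rho}\,T^{\nu\lambda}{}_{\alpha\sigma}
 - \Gamma^{\alpha}_{\mu\sigma}\,T^{\nu\lambda}{}_{\rho\alpha}.
\end{aligned} \tag{10.57}
$$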
Note that we can express the curvature as a commutator of covariant derivatives:

$$
[\nabla_\mu, \nabla_\nu]\,\xi^\lambda = -R^{\lambda}{}_{\sigma\mu\nu}\,\xi^\sigma. \tag{10.58}
$$


This may be compared with eqn. (10.45). 
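The inhomogeneous term $(\partial_\beta x^\lambda)(\partial_\mu\partial_\nu\xi^\beta)$ of the
transformation rule (10.56) can be evaluated explicitly in a simple case. The
sketch below assumes $\xi^\beta$ are Cartesian coordinates in the plane and
$x^\mu = (r, \theta)$ are plane polar coordinates; this example and all function
names are illustrative assumptions, not taken from the text.

```python
import math

def d2xi(r, th):
    """Second derivatives d_mu d_nu xi^beta of xi = (r cos th, r sin th).

    Indexed [mu][nu][beta] with mu, nu in (r, theta) and beta in (x, y).
    """
    c, s = math.cos(th), math.sin(th)
    return [[(0.0, 0.0), (-s, c)],
            [(-s, c), (-r * c, -r * s)]]

def dx_dxi(r, th):
    """Inverse Jacobian d x^lambda / d xi^beta, indexed [lambda][beta]."""
    c, s = math.cos(th), math.sin(th)
    return [(c, s), (-s / r, c / r)]

def inhomogeneous_term(r, th):
    """(d_beta x^lambda)(d_mu d_nu xi^beta), indexed [lambda][mu][nu]."""
    inv, d2 = dx_dxi(r, th), d2xi(r, th)
    return [[[sum(inv[l][b] * d2[m][n][b] for b in range(2))
              for n in range(2)] for m in range(2)] for l in range(2)]

G = inhomogeneous_term(2.0, 0.3)
print(G[0][1][1])   # lambda = r, mu = nu = theta: equals -r
print(G[1][0][1])   # lambda = theta, mu = r, nu = theta: equals 1/r
```

Up to the overall sign convention attached to the connection in eqns. (10.55)
and (10.56), these are the familiar connection coefficients of plane polar
coordinates, $-r$ and $1/r$: a connection that vanishes in the Cartesian basis
is non-zero in the curvilinear one.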


10.6.3 The two-level atom in a strong radiation field 


It was first realized by Jaynes and Cummings that a semi-classical model of a 
two-level atom could reproduce the essential features of the quantum theoretical 
problem [79]. The two-level system has a broad repertoire of applications in
physics, from spin models to the micromaser [91]. It is related to a class of 
Dicke models [37, 57], and, in the so-called rotating wave approximation, it 
becomes the Jaynes-Cummings model [79] which may be solved exactly. A 
Hamiltonian analysis of symmetries in this Jaynes-Cummings model is given in 
ref. [7]. 

The symmetry techniques and principles of covariant field theory can be 
applied to the two-level atom to solve the full model and eliminate the need 
for the so-called rotating wave approximation. Consider the phenomenological 
two-level system described by the action 


a 
f= feo- onrar — WaVasOWe 


ih 
+ aV DN = or (10.59) 
where $A, B = 1, 2$ characterizes the two levels, $i\hbar D_t = i\hbar\partial_t + i\Gamma(t)$ in
matrix notation, and $\Gamma = \Gamma_{AB}$ is an off-diagonal anti-symmetrical matrix.


At frequencies which are small compared with the light-size of the atom, an 
atom may be considered electrically neutral. The distribution of charge within 
the atoms is not required here. In this approximation the leading interaction 
is a resonant dipole transition. The connection $\Gamma_{AB}$ plays an analogous role
to the electromagnetic vector potential in electrodynamics, but it possesses no
dynamics of its own. Rather, it works as a constraint variable, or auxiliary
Lagrange multiplier field. There is no electromagnetic vector potential in the
action, since the field is electrically neutral in this formulation. $\Gamma_{AB}$ refers not
to the U(1) phase symmetry but to the two-level symmetry. Variation of the
action with respect to $\Gamma(t)$ provides us with the conserved current


$$
\frac{\delta S}{\delta\Gamma_{AB}} = \frac{i}{2}\left(\psi_A^*\psi_B - \psi_B^*\psi_A\right), \tag{10.60}
$$
which represents the amplitude for stimulated transition between the levels. The 
current generated by this connection is conserved only on average, since we are 
not taking into account any back-reaction. The conservation law corresponds 


merely to 
ô 
Or ( 2 ) x sin (2/ xo). (10.61) 


where $X(t)$ will be defined later. The potential $V_{AB}(t)$ is time-dependent, and
comprises the effect of the level splitting as well as a perturbation mediated
by the radiation field. A ‘connection’ $\Gamma_{21} = -\Gamma_{12}$ is introduced since the
diagonalization procedure requires a time-dependent unitary transformation,
and thus general covariance demands that this will transform in a different basis.
The physics of the model depends on the initial value of this ‘connection’, and
this is the key to the trivial solubility of the Jaynes-Cummings model.

In matrix form we may write the action for the matter fields 


$$
S = \int (dx)\;\psi_A^*\,O_{AB}\,\psi_B, \tag{10.62}
$$
where 
rv? ih : 
—-— — — “AD J (t Tr 
-| 2m . ee oe A P | (10.63) 
J(t) — iP 12 -z = Ve Dy 


The level potentials may be regarded as constants in the effective theory. They
are given by $V_1 = E_1$ and $V_2 = E_2 - \hbar\Omega_R$ where $\hbar\Omega_R$ is the interaction
energy imparted by the photon during the transition, i.e. the continuous radiation
pressure on the atom. In the effective theory, we must add this by hand, since we
have separated the levels into independent fields which are electrically neutral;
it would follow automatically in a complete microscopic theory. The quantum
content of this model is now that this recoil energy is a quantized unit of $\hbar\Omega$,
the energy of a photon at the frequency of the source. Also, the amplitude of
the source, $J$, would be quantized and proportional to the number of photons
in the field. If one switches off the source (which models the photon’s electric
field), this radiation energy does not automatically go to zero, so this form is
applicable mainly to continuous operation (stimulation). The origin of the recoil
is clear, however: it is the electromagnetic force’s interaction with the electron,
transmitted to the nucleus by binding forces. What we are approximating is
clearly a $J^\mu A_\mu$ term for the electron, with neutralizing background charge.

It is now desirable to perform a unitary transformation on the action $\psi \to U\psi$,
$O \to UOU^{-1}$, which diagonalizes the operator $O$. Clearly, the connection
$\Gamma_{AB}$ will transform under this procedure by


ih -1 =i 
TEEDE (U(0,U~') — (a,U)U~") (10.64) 


since a time-dependent transformation is required to effect the diagonalization. 
; ; td x 292 ce 

For notational simplicity we define L = i — xh D,, so that the secular 

equation for the action is: 


(Ê — Ore) (eee Seer AQ — à) — (J? +T} ) ee (10.65) 


<> 
Note that since J 0, J = 0 there are no operator difficulties with this equation. 
The eigenvalues are thus 


Á z [cs 
jpc le En thos | (En — AQ) +J2+r?} (10.66) 
=- En +Q Jra? +J? +T (10.67)
= Î-— En +Q + har, (10.68) 
where Ep = (E; + E2) and É; = (E — E). For notational simplicity 
we define @ and œg. One may now confirm this procedure by looking for the 


eigenvectors and constructing U~! as the matrix of these eigenvectors. This may 
be written in the form 


—ı _ { cos@ —sin@ 
res ( sin cosé i} (10.07) 
where 
ACG 
cos0 = (OF OR) (10.70) 


(Po + or? +J? +T? 


V?+lp 


si 0 = ————— 
(Pe + or? +J? +T? 


(10.71) 
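The structure of the secular equation and of the diagonalizing rotation can be
illustrated with ordinary numbers. The sketch below is not the author's
calculation: it assumes a secular equation of the generic form
$(a - \lambda)(b - \lambda) - (J^2 + \Gamma^2) = 0$, with the diagonal entries
$a$ and $b$ of the operator treated as plain real numbers, and it checks the
rotation only for the special case $\Gamma = 0$, where the matrix is real and
symmetric. All names and sample values are hypothetical.

```python
import math

def eigenvalues(a, b, J, Gam):
    """Roots of the assumed secular equation (a - lam)(b - lam) - (J^2 + Gam^2) = 0."""
    mean, d = 0.5 * (a + b), 0.5 * (a - b)
    R = math.sqrt(d * d + J * J + Gam * Gam)
    return mean + R, mean - R

def mixing_angle(a, b, J, Gam):
    """cos(theta), sin(theta) from the normalized eigenvector of the upper root."""
    d = 0.5 * (a - b)
    R = math.sqrt(d * d + J * J + Gam * Gam)
    norm = math.sqrt((d + R) ** 2 + J * J + Gam * Gam)
    return (d + R) / norm, math.sqrt(J * J + Gam * Gam) / norm

a, b, J, Gam = 1.3, 0.4, 0.5, 0.2        # arbitrary sample numbers
lam_p, lam_m = eigenvalues(a, b, J, Gam)

def secular(lam):
    """Left-hand side of the assumed secular equation."""
    return (a - lam) * (b - lam) - (J * J + Gam * Gam)

# For Gam = 0, the rotation U^{-1} = [[c, -s], [s, c]] diagonalizes [[a, J], [J, b]]:
# the off-diagonal element of U M U^{-1} must vanish.
c, s = mixing_angle(a, b, J, 0.0)
off_diag = c * (-a * s + J * c) + s * (-J * s + b * c)

print(secular(lam_p), secular(lam_m), off_diag)
```

All three printed numbers vanish to rounding error. In this sketch the
half-difference of the diagonal entries and the square root play the roles that
the detuning and the shifted frequency play in the mixing angle above.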


The change in the connection T (t) is thus off-diagonal and anti-symmetric, as 
required by the gauge symmetry conservation law: 


(ets ( =o ~ ). (10.72) 


The time derivative of 0(t) may be written in one of two forms, which must 
agree 


0, cos 0 0, sin 8 
(0,8) = = 


= . 10.73 
— sind cos 0 ( ) 


This provides a consistency condition, which may be verified, and leads to the 
proof of the identities 


OpRO;Onp = J OJ +r gr (10.74) 
and 


VJ? +T; + AV J2 +T? + (@ + or) + A)(@ + og) = 0 
(10.75) 


for arbitrary J (t) and T (t), where 


1 0, ((@ + ar)? + J? +T?) 
2 (+r? +I? +r? 


A= (10.76) 
These relations are suggestive of a conformal nature to the transformation and,
with a little manipulation using the identities, one evaluates 


T12/ = (0,0) = 


(JəæJ +r 3T) f — (Õ+or)(Õ+ 20r) | 
orv J? +T? (@ + aotr] 
(10.77) 


This quantity vanishes when $J^2 + \Gamma_{12}^2$ is constant with respect to time. Owing
to the various identities, the result presented here can be expressed in many 
equivalent forms. In particular, it is zero when & = 0. The equations of motion 
for the transformed fields are now 


L = E2 + hor 10,0 We 
Re de =0. 10.78 
| —10,0 D= Ey = hop | ( y- ( ) 


In this basis, the centre of mass motion of the neutral atoms factorizes from 
the wavefunction, since a neutral atom in an electromagnetic field is free on 
average. The two equations in the matrix above may therefore be unravelled by 
introducing a ‘gauge transformation’, or ‘integrating factor’, 


$$
\psi_\pm(x) = e^{\pm i\int_0^t X(t')\,dt'}\;\overline{\psi}(x), \tag{10.79}
$$
where the free wavefunction in n = 3 dimensions is 
= dw dk . 
a Beers emus 10.80 
o= aoa (x) (10.80) 


is a general linear combination of plane waves satisfying the dispersion relation 
for centre of mass motion 


hak? = 
X = — +AQ-o)- En = 0. (10.81) 

2m 
The latter is enforced by the delta function. This curious mixture of continuous
($\omega$) and discontinuous ($\Omega$) belies the effective nature of the model and the
fact that its validity is only for a continuous operation (an eternally sinusoidal
radiation source which never starts or stops). The relevance of the model is thus
limited by this. Substituting this form, we identify $X(t)$ as the integrating factor
for the uncoupled differential equations. The complete solution is therefore


wax) = eih R+B (y), (10.82) 


Notice that this result is an exact solution, in the sense of being in a closed form. 
In the language of a gauge theory this result is gauge-dependent. This is because 
our original theory was not invariant under time-dependent transformations. 
The covariant procedure we have applied is simply a method to transform the
equations into an appealing form; it does not imply invariance of the results
under a wide class of sources.

That this system undergoes transitions in time may be seen by constructing
wavefunctions which satisfy the boundary conditions where the probability of
being in one definite state of the system is zero at $t = 0$. To this end we write
$\psi_1 = (\psi_+ + \psi_-)$ and $\psi_2 = (\psi_+ - \psi_-)$. In order to proceed beyond this
point, it becomes necessary to specify the initial value of $\Gamma_{12}$. This choice carries
with it physical consequences; the model is not invariant under this choice.
The obvious first choice is to set this to zero. This would correspond to not
making the rotating wave approximation in the usual two-level atom, with a
cosine perturbation. Focusing on the state $\psi_2$ which was unoccupied at $t = 0$
for $\Gamma_{12} = 0$,


t 
Wo = sin ( | dr’ V @ +h? JE cos?(Qt’) 
0 


-_ JoQsin(Qr') [. Jê meee — 
210 e 22m . (10. 
O oe E E o w(x). (10.83) 


We are interested in the period, and the amplitude of this quantity, whose
squared norm may be interpreted as the probability of finding the system in
the prepared state, given that it was not there at $t = 0$. Although the integral
is then difficult to perform exactly, it is possible to express it in terms of
Jacobian elliptic integrals, logarithms and trig functions. Nevertheless it is clear
that $\tilde\omega = \frac{1}{2}(\tilde E_{21}/\hbar - \Omega)$ is the decisive variable. When $\hbar\tilde\omega \ll J_0$ is small,
the first term is $J_0\cos(\Omega t)$ and the second term is small. This is resonance,
although the form of the solution is perhaps unexpected. The form of the
wavefunction guarantees a normalized result which is regular at $\tilde\omega = 0$, and one
has $\psi_2 \sim \sin\left(\int_0^t dt'\,\frac{J_0}{\hbar}\cos(\Omega t')\right)$, which may be compared with the standard
result of the Jaynes-Cummings model $\psi_2 \sim \sin(J_0 t/\hbar)$. In the quantum case
the amplitude of the radiation source, $J_0$, is quantized as an integral number, $N_\Omega$,
of photons of frequency $\Omega$. Here we see modulation of the rate of oscillation by
the photon frequency (or equivalently the level spacing). In a typical system, the
photon frequency is several tens of orders of magnitude larger than the coupling
strength $J_0 \ll \hbar\Omega \sim E_{12}$ and thus there is an extremely rapid modulation of
the wavefunction. This results in an almost chaotic collapse-revival behaviour
with no discernible pattern, far from the calm sinusoidal Rabi oscillations of the
Jaynes-Cummings model. If $\hbar\tilde\omega \sim J_0$, the second term is of order unity, and
then, defining the normalized resonant amplitude


J 
A= US (10.84) 


Jhew + Je 
one has


_ [ J2 sin(Qr) = 
Yo ~ sin | —— E (Qt, A) — A faen W(x). 
( A V1 -— A? ee 


(10.85) 


The Jacobian elliptic integral $E(\alpha, \beta)$ is a doubly periodic function, so one
could expect qualitatively different behaviour away from resonance. On the
other hand, far from resonance, $\hbar\tilde\omega \gg J_0$, the leading term of the connection
becomes $\psi_2 \sim \sin(\tilde\omega t)\,\overline{\psi}(x) \sim \sin(\Omega t)\,\overline{\psi}(x)$, and the effect of the level
spacing is washed out.

One can also consider other values for the connection. Comparing $\Gamma_{12}$ to
the off-diagonal sources $\gamma^\mu D_\mu$, predicted on the basis of unitarity in effective
non-equilibrium field theory [13], one obtains an indication that, if the initial
connection is in phase with the time derivative of the perturbation, then one
can effectively ‘re-sum’ the decay processes using the connection. This is a
back-reaction effect of the time-dependent perturbation, or a renormalization
in the language of ref. [13]. If one chooses $\Gamma_{12} = J_0\sin(\Omega t)$, this has the
effect of making the off-diagonal terms in the action not merely cosines but
a complex conjugate pair $J_0\exp(\pm i\Omega t)$. This corresponds to the result one
obtains from making the rotating wave approximation near resonance. This
initial configuration is extremely special. With this choice, one has exactly


Wo = sin o dt’ [v + na|) Wx). (10.86) 


The stability of the solution is noteworthy, and the diagonalizing transformation
is rendered trivial. The connection $\partial_t\theta$ is now zero under the diagonalizing
transformation. Thus, the above result is exact, and it is the standard result of
the approximated Jaynes-Cummings model. This indicates that the validity of 
the Jaynes-Cummings model does not depend directly on its approximation, but 
rather on the implicit choice of a connection. 


10.7 Global symmetry breaking²


The dynamical properties of certain interacting fields lead to solution surfaces
whose stable minima favour field configurations, which are ordered, over
random ones. Such fields are said to display the phenomenon of spontaneous
ordering, or spontaneous symmetry breaking. This is a phenomenon in which
the average behaviour of the field, in spite of all its fluctuations, is locked into
a sub-set of its potential behaviour, with less symmetry. A classic example of
this is the alignment of spin in ferromagnetism, in which rotational symmetry is
broken into a linear alignment.

² $\hbar = c = \mu_0 = \epsilon_0 = 1$ in this section.

Spontaneous symmetry breaking can be discussed entirely within the frame- 
work of classical field theory, but it should be noted that its dependence on 
interactions raises the problem of negative energies and probabilities, which is 
only fully resolved in the quantum theory of fields. 

When a continuous global symmetry is broken (i.e. when its average state 
does not express the full global symmetry), one sees the appearance of massless 
modes associated with each suppressed symmetric degree of freedom. These 
massless modes are called Nambu—Goldstone bosons [59, 60, 99, 100]. To see 
how they arise, consider the action 


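$$
S = \int (dx) \left\{ \frac{1}{2}(\partial^\mu\phi_A)(\partial_\mu\phi_A) + \frac{1}{2}m^2\,\phi_A\phi_A + \frac{\lambda}{4!}(\phi_A\phi_A)^2 \right\}. \tag{10.87}
$$

The interaction potential $V(\phi) = \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4$ has a minimum at

$$
\frac{\partial V(\phi)}{\partial\phi_A} = m^2\phi_A + \frac{\lambda}{6}\,\phi_A(\phi_B\phi_B) = 0. \tag{10.88}
$$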


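This would therefore be the equilibrium value for the average field. Note that
a non-zero value for $\langle\phi\rangle$, within a bounded potential $\lambda > 0$, is possible only
if $m^2 < 0$. Suppose one now considers the effect of fluctuations, or virtual
processes, in the field. Following the procedure of chapter 6, one may split the
field into an average (constant) part $\langle\phi\rangle$ and a fluctuating (quickly varying) part
$\varphi$,

$$
\phi_A = \langle\phi\rangle_A + \varphi_A. \tag{10.89}
$$

Expressed in terms of these parts, the terms of the action become:

$$
\begin{aligned}
(\partial^\mu\phi_A)(\partial_\mu\phi_A) &\to (\partial^\mu\varphi_A)(\partial_\mu\varphi_A) \\
\frac{1}{2}m^2\,\phi_A\phi_A &\to \frac{1}{2}m^2\left(\langle\phi\rangle_A\langle\phi\rangle_A + 2\langle\phi\rangle_A\varphi_A + \varphi_A\varphi_A\right) \\
(\phi_A\phi_A)^2 &\to (\langle\phi\rangle_A\langle\phi\rangle_A)^2 + 4(\langle\phi\rangle_A\langle\phi\rangle_A)(\langle\phi\rangle_B\varphi_B) \\
 &\quad + 2(\varphi_A\varphi_A)(\langle\phi\rangle_B\langle\phi\rangle_B) + 4(\varphi_A\langle\phi\rangle_A)(\varphi_B\langle\phi\rangle_B) \\
 &\quad + 4(\langle\phi\rangle_A\varphi_A)(\varphi_B\varphi_B) + (\varphi_A\varphi_A)^2.
\end{aligned} \tag{10.90}
$$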


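To quadratic order, the action therefore takes the form

$$ S = \int (dx) \left\{ \frac{1}{2}(\partial^\mu\varphi_A)(\partial_\mu\varphi_A) + \frac{1}{2}\varphi_A\left[ m^2 + \frac{\lambda}{6}\langle\phi\rangle^2 \right]\varphi_A + \frac{1}{2}\varphi_A\left[ \frac{\lambda}{3}\langle\phi\rangle_A\langle\phi\rangle_B \right]\varphi_B + \ldots \right\}. \tag{10.91} $$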
If the action is evaluated at the minimum of the potential, substituting for
the minimum $\langle\phi\rangle_A$, the quadratic masslike terms do not vanish, nor is any
asymmetry created. The action is still invariant under rotations in $A, B$ space,
with a different mass matrix $\frac{\lambda}{3}\langle\phi\rangle_A\langle\phi\rangle_B$. However, if one postulates that it is
favourable to select a particular combination for $\langle\phi\rangle_A$, e.g. let $A, B = 1, 2$ and
$\langle\phi\rangle_1 = 0$, $\langle\phi\rangle_2 = \langle\phi\rangle$, thus breaking the symmetry between degenerate choices,
then the quadratic terms become:

$$ \frac{1}{2}\varphi_1\left( m^2 + \frac{\lambda}{6}\langle\phi\rangle^2 \right)\varphi_1 + \frac{1}{2}\varphi_2\left( m^2 + \frac{\lambda}{2}\langle\phi\rangle^2 \right)\varphi_2. \tag{10.92} $$

The first of these terms, evaluated at the minimum, vanishes, meaning that $\varphi_1$
is a massless excitation at the equilibrium solution. It is a Nambu–Goldstone
boson, which results from the selection of a special direction. The rotational
$A, B$ symmetry of the fluctuating field $\varphi_A$ is still present, but the direction of the
average field is now chosen at all points.
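
As a check on eq. (10.92), one may substitute the minimum condition of eq. (10.88) explicitly. The following worked step is a sketch, assuming the choice $\langle\phi\rangle_1 = 0$, $\langle\phi\rangle_2 = \langle\phi\rangle$ made above and writing $\varphi_A$ for the fluctuations:

```latex
\begin{align*}
m^2 + \frac{\lambda}{6}\langle\phi\rangle^2 = 0
  \quad &\Rightarrow \quad \langle\phi\rangle^2 = -\frac{6m^2}{\lambda}, \\
\varphi_1:\quad m^2 + \frac{\lambda}{6}\langle\phi\rangle^2 &= 0, \\
\varphi_2:\quad m^2 + \frac{\lambda}{2}\langle\phi\rangle^2 &= -2m^2 > 0 .
\end{align*}
```

The component orthogonal to the average field is massless, while the component along it acquires a positive mass squared, since $m^2 < 0$.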

In this two-dimensional rotational example, the special direction was chosen 
by hand, using the ad hoc assumption that the scalar field would have an 
energetically favoured ordered state. Clearly, one could have chosen any 
direction (linear combination of $\phi_A$, from the rotational invariance), and the
result would be the same, due to the original symmetry. Since these are all 
equivalent, it takes only the energetic selection of any one of them to lead to an 
ordering, and thus spontaneous symmetry breaking. In the parametrization 


$$ \Phi = \frac{1}{\sqrt{2}}\,\rho\, e^{i\theta} \tag{10.93} $$


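the symmetry properties of the action become even more transparent. The action
is now:

$$ S = \int (dx) \left\{ \frac{1}{2}(\partial^\mu\rho)(\partial_\mu\rho) + \frac{1}{2}\rho^2(\partial^\mu\theta)(\partial_\mu\theta) + \frac{1}{2}m^2\rho^2 + \frac{\lambda}{4!}\rho^4 \right\}. \tag{10.94} $$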


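This, assuming a stable average state $\rho \to \langle\rho\rangle + \rho$, gives, to quadratic order:

$$ S = \int (dx) \left\{ \frac{1}{2}\rho\left[ -\Box + m^2 + \frac{\lambda}{2}\langle\rho\rangle^2 \right]\rho + \frac{1}{2}\langle\rho\rangle^2\,\theta(-\Box)\theta \right\}. \tag{10.95} $$

The angular $\theta$ excitation is clearly massless. This parametrization has presented
several technical challenges in the quantum theory, however, so we shall not
pursue it in detail.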

The foregoing argument can be generalized to any continuous global group, 
either Abelian or non-Abelian. Suppose that the action 


$$ S = \int (dx) \left\{ T(\partial_\mu\phi_A) - V(\phi_A) \right\} \tag{10.96} $$

is invariant under a symmetry group $G$, of dimension $d_G$; then, if it is
energetically favourable for the field to develop a stable average $\langle\phi\rangle$, with
restricted behaviour, such that

$$ \phi_i \to \langle\phi\rangle_i + \varphi_i \tag{10.97} $$

for a sub-set of the components $i \in A$, there must be a minimum in the potential,
such that

$$ \left. \frac{\partial V}{\partial\phi_i} \right|_{\phi = \langle\phi\rangle} = 0. \tag{10.98} $$

The field there splits into two parts:

$$ \phi_A \to \begin{cases} \langle\phi\rangle_i + \varphi_i & \in H \\ \varphi_A \ (A \neq i) & \in G/H. \end{cases} \tag{10.99} $$


The first part has a stable average and small fluctuations around this value. 
The remainder of the components are unconstrained fluctuations, which are 
orthogonal in the group theoretical sense from the others. For the components 
with non-zero averages, one may expand the potential around the minimum: 


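$$ V(\phi_A) = V(\langle\phi\rangle_A) + \left. \frac{\partial V}{\partial\phi_A} \right|_{\phi_A = \langle\phi\rangle_A} \varphi_A + \frac{1}{2} \left. \frac{\partial^2 V}{\partial\phi_A\partial\phi_B} \right|_{\phi_A = \langle\phi\rangle_A} \varphi_A\varphi_B + \ldots. \tag{10.100} $$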


The form and value of the potential are unchanged by a group transformation G, 
since the action is invariant under G. Moreover, by assumption of a minimum, 
one must have 


$$ M^2_{AB} = \left. \frac{\partial^2 V}{\partial\phi_A\partial\phi_B} \right|_{\phi_A = \langle\phi\rangle_A} \geq 0. \tag{10.101} $$


To determine whether any of the components of this have to be zero, one uses the 
assumption that the average state is invariant under the sub-group H. Invariance 
under H means that 


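$$ V(U_H\langle\phi\rangle) = V(\langle\phi\rangle) + \frac{\partial^2 V(\langle\phi\rangle)}{\partial\phi_i\partial\phi_j}\,\delta_H\langle\phi\rangle_i\,\delta_H\langle\phi\rangle_j + \ldots; \tag{10.102} $$

thus, $\delta_H\langle\phi\rangle_i = 0$ and $M^2_{ij}$ is arbitrary, since the transformation itself is
null-potent at $\langle\phi\rangle$. However, if one transforms the average state by an element which
does not belong to the restricted group $H$, then $\delta_g\langle\phi\rangle \neq 0$, and

$$ V(U_g\langle\phi\rangle) = V(\langle\phi\rangle) + \left. \frac{\partial^2 V}{\partial\phi_A\partial\phi_B} \right|_{\phi_A = \langle\phi\rangle_A} \delta_g\langle\phi\rangle_A\,\delta_g\langle\phi\rangle_B + \ldots. \tag{10.103} $$

Thus, for any $A, B$ which do not belong to $i, j$, the mass terms $M^2_{AB} = 0$ for
invariance of the potential. These are the massless modes. There are clearly
$\dim G/H = d_G - d_H$ of these massless elements, which correspond to all of the
fluctuations which are not constrained by the average state.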
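
The counting of massless modes can be illustrated numerically. The following is a minimal sketch, assuming the quartic potential of eq. (10.87) with an O(N) multiplet; the function names, the parameter values and the choice of placing the average along the last component are hypothetical. It builds the matrix of second derivatives of the potential at the minimum and counts its zero eigenvalues, which should equal $\dim O(N)/O(N-1) = N - 1$.

```python
import numpy as np

def mass_matrix(vev, m2, lam):
    """Second derivatives of V = (1/2) m2 phi.phi + (lam/4!) (phi.phi)^2 at phi = vev."""
    vev = np.asarray(vev, dtype=float)
    v2 = vev @ vev
    return (m2 + lam * v2 / 6.0) * np.eye(vev.size) + (lam / 3.0) * np.outer(vev, vev)

def count_massless(n_components, m2=-1.0, lam=0.6, tol=1e-9):
    """Count zero eigenvalues of the mass matrix at the minimum of the potential."""
    # Hypothetical choice: the whole average lies along the last component.
    vev = np.zeros(n_components)
    vev[-1] = np.sqrt(-6.0 * m2 / lam)   # solves m2 + lam * v^2 / 6 = 0
    eigenvalues = np.linalg.eigvalsh(mass_matrix(vev, m2, lam))
    return int(np.sum(np.abs(eigenvalues) < tol)), eigenvalues

n_zero, spectrum = count_massless(4)
print(n_zero, spectrum)
```

For four components, three directions are flat and one is curved with eigenvalue $-2m^2$, in line with the O(2) case treated above.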

This argument does not depend on whether the group is Abelian or non- 
Abelian (except that the coset dimension G/H does not apply to groups like 
U(1)), only on the fact that a stable average emerges, picking out a special 
direction in group space. Since even a single group generator, corresponding to 
a single component of the field, generates a sub-group, the average field lies in 
a group of its own (the factor group). If the group H is an Abelian sub-group, 
such as $Z_N$ (generated by the Cartan sub-algebra of the full Lie algebra), then
the resulting factor group shares the same algebra as the full group, only the 
centre of the group is broken. This changes the dimension of the representation, 
but does not change the universal cover group for the symmetry. If H is not an 
Abelian sub-group, then the basic algebra of the symmetry must also change. 

The Nambu–Goldstone mechanism is a relative suppression of certain fluctuations,
rather than a breakdown of fundamental symmetry. For example, in
a crystal, with an $R^n$ symmetry, the crystal lattice breaks up translations into
$R^n/Z_N$, leading to massless vector fields, which are phonons.

It is not clear from the above that the choice of symmetry breaking potential 
is actually feasible: it has not been shown that the fluctuations around the 
average state are small enough to sustain the average value that was assumed. 
This requires a more lengthy calculation, using the generating functionals of 
chapter 6. Moreover, unless the result of the calculation can be determined 
entirely by quadratic terms, one is forced to use quantum field theory to 
calculate the expectation values, since there are questions of negative energies 
and probabilities which are only resolved by operator ordering in the second 
quantization. General theorems exist which prohibit the existence of Goldstone 
bosons, due to infra-red divergences, and thus global symmetry breaking in less 
than three spatial dimensions cannot occur by this mechanism [27, 97]. 

The occurrence of spontaneous symmetry breaking assumes that it will be 
possible to find a system in which the effective mass squared in the action is less 
than zero. Clearly no such fundamental fields exist: they would be tachyonic. 
However, composite systems, or systems influenced by external forces, can have 
effective mass-squared terms which have this property. This is exploited in 
heuristic studies of phase transitions, where one often writes the mass term as: 


$$ m^2(T) = \left( \frac{T - T_c}{T_c} \right) m_0^2, \tag{10.104} $$

which gives rise to a second-order phase transition at critical temperature $T_c$
($n > 2$), i.e. a change from an ordered average state at low temperature to a
disordered state above the critical temperature. 
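
The behaviour described by eq. (10.104) can be sketched in a few lines. This is only an illustration, assuming the quartic potential $V = \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4$ of section 10.7, so that the non-trivial minimum satisfies $\langle\phi\rangle^2 = -6m^2(T)/\lambda$; the function names and the numerical values are hypothetical.

```python
import math

def mass_squared(T, T_c, m0_sq):
    """Effective mass squared of eq. (10.104)."""
    return (T - T_c) / T_c * m0_sq

def order_parameter(T, T_c, m0_sq, lam):
    """Magnitude of the average field minimizing V = m^2 phi^2 / 2 + lam phi^4 / 4!."""
    m_sq = mass_squared(T, T_c, m0_sq)
    if m_sq >= 0.0:
        return 0.0                      # disordered phase: the only minimum is phi = 0
    return math.sqrt(-6.0 * m_sq / lam) # ordered phase: non-zero average field

for T in (0.5, 0.9, 1.0, 1.5):
    print(T, order_parameter(T, T_c=1.0, m0_sq=1.0, lam=0.6))
```

The average field falls continuously to zero as $T$ approaches $T_c$ from below, which is the signature of a second-order transition.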
10.8 Local symmetry breaking


The predominance of gauge theories in real physical models leads one to ask 
whether symmetry breaking phenomena could occur in local gauge theories. 
Here one finds a subtly different mechanism, originally pointed out by Anderson 
[3], inspired by an observation of Schwinger [117], and rediscovered in the 
context of non-Abelian field theory by Higgs [68, 69, 70]. It is called the 
Anderson–Higgs mechanism, or simply the Higgs mechanism.

The action for this model is that of a complex scalar field coupled to the 
electromagnetic field. It is sometimes used as a simple Landau–Ginzburg
model of super-conductivity (see section 12.6). It is also referred to as scalar 
electrodynamics. A straightforward non-Abelian generalization is used in 
connection with the Standard Model; this is discussed in many other references 
[136]. The action in complex form is written 


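$$ S = \int (dx) \left\{ (D^\mu\Phi)^\dagger(D_\mu\Phi) + m^2\Phi^\dagger\Phi + \frac{\lambda}{3!}(\Phi^\dagger\Phi)^2 + \frac{1}{4}F^{\mu\nu}F_{\mu\nu} \right\}. \tag{10.105} $$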


Here we have only written a $\Phi^4$ interaction explicitly, with coupling constant $\lambda$.
Other interactions are also possible depending on the criteria for the model. In
the quantum theory, restrictions about renormalizability exclude higher powers
of the field in 3 + 1 dimensions. In 2 + 1 dimensions one may add a term
proportional to $(\Phi^\dagger\Phi)^3$. Odd powers of the fields are precluded by the fact that the action
must be real. The covariant derivative is usually written $D_\mu = \partial_\mu + ieA_\mu$. The
conserved current generated by the gauge field $A_\mu$ is therefore

$$ \frac{\delta S}{\delta A^\mu} = J_\mu = ie\left( \Phi^\dagger(D_\mu\Phi) - (D_\mu\Phi)^\dagger\Phi \right). \tag{10.106} $$

The action clearly has a basic $U(1)$ symmetry. An alternative form of the action
is obtained by re-writing the complex field in terms of two real component fields
$\phi_A$, where $A = 1, 2$, as follows:

$$ \Phi(x) = \frac{1}{\sqrt{2}}(\phi_1 + i\phi_2). \tag{10.107} $$

The covariant derivative acting on the fields can then be expanded in real and
imaginary parts to give

$$ D_\mu\phi_A = \partial_\mu\phi_A - e\,\epsilon_{AB}\,\phi_B A_\mu. \tag{10.108} $$


Footnote 3: $\hbar = c = \mu_0 = \epsilon_0 = 1$ in this section.
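The action then takes the more complicated form

$$ \begin{aligned} S = \int (dx) \Big\{ & \frac{1}{2}(\partial^\mu\phi_A)(\partial_\mu\phi_A) - e(\partial^\mu\phi_A)\epsilon_{AB}A_\mu\phi_B + \frac{1}{2}e^2\epsilon_{AB}\epsilon_{AC}\phi_B\phi_C A^\mu A_\mu \\ & + \frac{1}{2}m^2\phi_A\phi_A + \frac{\lambda}{4!}(\phi_A\phi_A)^2 + \frac{1}{4}F^{\mu\nu}F_{\mu\nu} \Big\}. \end{aligned} \tag{10.109} $$
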
Expressed in this language, the conserved current becomes 


$$ J_\mu = e\,\epsilon_{AB}\,(\phi_A D_\mu\phi_B). \tag{10.110} $$

This shows the anti-symmetry of the current with respect to the field components
in this $O(2)$ formulation.

Suppose, as before, that one component of the scalar field develops a constant
non-zero expectation value $\phi_1 \to \langle\phi\rangle + \varphi_1$; the action can be expanded around
this solution. Once again, this must be justified by an energy calculation to show
that such a configuration is energetically favourable; this is non-trivial and will not
be discussed here. It is interesting to compare what happens in the presence of
the Maxwell field with the case in the previous section. The part of the action
which is quadratic in $\varphi_1, \varphi_2, A_\mu$ is the dynamical part of the fluctuations. It is
given by


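$$ \begin{aligned} S^{(2)} = \int (dx) \Big\{ & \frac{1}{2}\varphi_1\left[ -\Box + m^2 + \frac{\lambda}{2}\langle\phi\rangle^2 \right]\varphi_1 + \frac{1}{2}\varphi_2\left[ -\Box + m^2 + \frac{\lambda}{6}\langle\phi\rangle^2 \right]\varphi_2 \\ & + e\langle\phi\rangle A^\mu(\partial_\mu\varphi_2) + \frac{1}{2}A^\mu\left[ (-\Box + e^2\langle\phi\rangle^2)g_{\mu\nu} + \partial_\mu\partial_\nu \right]A^\nu \Big\}. \end{aligned} \tag{10.111} $$

This may be diagonalized with the help of the procedure analogous to
eqn. (A.11) in Appendix A. The identity

$$ \frac{1}{2}\varphi_2 A \varphi_2 + B\varphi_2 = \frac{1}{2}(\varphi_2 + BA^{-1})A(\varphi_2 + A^{-1}B) - \frac{1}{2}BA^{-1}B \tag{10.112} $$

with

$$ A = \left[ -\Box + m^2 + \frac{\lambda}{6}\langle\phi\rangle^2 \right], \qquad B = -e\langle\phi\rangle(\partial_\mu A^\mu), \tag{10.113} $$

results in an action of the form

$$ S^{(2)} \to \int (dx) \left\{ \frac{1}{2}\varphi_1\left[ -\Box + m^2 + \frac{\lambda}{2}\langle\phi\rangle^2 \right]\varphi_1 + \frac{1}{2}A^\mu\left[ (-\Box + e^2\langle\phi\rangle^2)g_{\mu\nu} + G\,\partial_\mu\partial_\nu \right]A^\nu \right\}, \tag{10.114} $$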


where G is a gauge-dependent term. The details of this action are less interesting 
than its general characteristics. Unlike the case of the global symmetry, 
there is only one remaining scalar field component. The component which
corresponds to the Goldstone boson disappears in the variable transformations
and re-appears as a mass for the vector field. The lack of a Goldstone
boson is also interesting, since it circumvents the problems associated with 
Goldstone bosons in lower dimensions n < 3 [27, 97]. Although it is only 
an idealized effective theory, this local symmetry breaking mechanism indicates 
that symmetry breaking is indeed possible when one relaxes the rigidity of a 
global group. 

The transmutation of the massless scalar excitation into a mass for the vector 
field can be seen even more transparently in the unitary gauge. The unitary 
gauge is effected by the parametrization 


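$$ \Phi = \frac{1}{\sqrt{2}}\,\rho\, e^{i\theta} \tag{10.115} $$

$$ B_\mu = A_\mu + \frac{1}{e}\partial_\mu\theta \tag{10.116} $$

so that the action becomes

$$ S = \int (dx) \left\{ \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2}(\partial^\mu\rho)(\partial_\mu\rho) + \frac{e^2}{2}\rho^2 B^\mu B_\mu + \frac{1}{2}m^2\rho^2 + \frac{\lambda}{4!}\rho^4 \right\}. \tag{10.117} $$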

What looks like a gauge transformation by a phase @ is now a dynamical 
absorption of the Goldstone boson. This is sometimes stated by saying that 
the Goldstone boson is ‘eaten up’ by the gauge field, as if the photon were some 
elementary particular Pacman. A more field theoretical description is to say 
that the Goldstone mode modulates the fluctuations of the electromagnetic field, 
making them move in a wavefront. This wavefront impedes the fluctuations 
by an amount that depends upon the gauge coupling constant e. The result 
is an effective mass for the gauge fluctuations, or a gap in their spectrum of 
excitations. However one states it, the Goldstone field ceases to be a separate 
excitation due to the coupling: its modulation of the vector field’s zero point
energy breaks the gauge invariance of the fluctuations and it re-appears, with a 
new status, as the extra mode of the vector field. 
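
The absorption of the phase into the vector field can be made explicit in one step. The following is a sketch, assuming the covariant derivative $D_\mu = \partial_\mu + ieA_\mu$ quoted earlier in this section and the parametrization of eqs. (10.115) and (10.116):

```latex
\begin{align*}
D_\mu \Phi &= (\partial_\mu + i e A_\mu)\,\frac{1}{\sqrt{2}}\,\rho\, e^{i\theta}
   = \frac{e^{i\theta}}{\sqrt{2}}\Big[\partial_\mu\rho + i e\, \rho\, B_\mu\Big],
   \qquad B_\mu = A_\mu + \frac{1}{e}\partial_\mu\theta, \\
(D^\mu\Phi)^\dagger (D_\mu\Phi) &= \frac{1}{2}(\partial^\mu\rho)(\partial_\mu\rho)
   + \frac{1}{2} e^2 \rho^2 B^\mu B_\mu , \\
\rho \to \langle\rho\rangle + \rho: \qquad
\frac{1}{2} e^2 \rho^2 B^\mu B_\mu &= \frac{1}{2} e^2 \langle\rho\rangle^2 B^\mu B_\mu + \ldots
\end{align*}
```

The phase $\theta$ no longer appears on its own, and the constant part of $\rho$ supplies a mass-like term $e^2\langle\rho\rangle^2$ for the field $B_\mu$.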

It cannot be emphasized enough that the assumption that there exists a stable 
average state of lower symmetry than the fluctuations of the theory is ad hoc, and 
its consistency has to be proven. Even today, this remains one of the toughest 
challenges for quantum field theory. 


10.9 Dynamical symmetry breaking mechanisms 


The Nambu–Goldstone or Anderson–Higgs models of symmetry breaking cannot
be fundamental theories, because they do not explain how the mass-squared
terms, in their Lagrangians, can become negative. As such, they must be 
regarded as effective actions for deeper theories. Moreover, their apparent 
reliance on the existence of an arbitrary scalar field has been controversial, since, 
in spite of the best efforts of particle physicists, no one has to date observed a 
Higgs scalar particle. The introduction of a scalar field is not the only way in 
which gauge symmetries can be broken, however. At least two other possibilities 
exist. Both rely on quantum dynamical calculations, but can be mentioned 
here. 

One such mechanism was suggested in connection with field theories on 
topologically non-trivial spacetimes (e.g. the torus), based on an idea by 
Ford [52], that non-trivial average states, such as vortices, could occur around
topological singularities in spacetime. The main idea is that a gauge field $A_\mu \to
\langle A_\mu\rangle + A_\mu$ (either Abelian or non-Abelian) can acquire a non-zero expectation
value around a hole in spacetime. In simply connected spacetimes (without 
holes), such constant vector field configurations are gauge-equivalent to zero 
and thus have no invariant meaning. However, around a topological singularity, 
such transformations are restricted by the cohomology of the manifold. One 
example is that of a periodic crystal, which has the same boundary conditions as 
the surface of a torus, and is therefore relevant in solid state physics. 

In the Abelian theory, the phenomenon is a purely classical, statistical effect, 
though for non-Abelian symmetries the non-linearity makes it the domain of 
quantum field theory. It is equivalent to there being a constant magnetic flux 
through the centre of the hole. In some theories, such expectation values might 
occur spontaneously, by the dynamics of the model (without having to assume 
a negative mass squared ad hoc). In the Abelian case, this results only in a 
phase. However, it was later explored in the context of non-Abelian symmetries 
by Hosotani [72] and Toms [129] and developed further in refs. [17, 19, 20, 21, 
32, 33]. Such models are of particular interest in connection with grand unified 
theories, such as Kaluza–Klein and string theory, where extra dimensions are
involved. Topological singularities also occur in lower dimensions in the form
of vortices and the Aharonov–Bohm effect.

The second mechanism is the Coleman and Weinberg mechanism [28], which
is a purely quantum effect for massless fields, whereby a non-trivial average 
state can be created truly spontaneously, by the non-linearities of massless scalar 
electrodynamics. Quantum fluctuations themselves lead to the attainment of an 
ordered state. It is believed that this mechanism leads to a first-order phase 
transition [66, 86], rather than the second-order transitions from the Goldstone 
and Higgs models. 


11 


Position and momentum 


Field theory is rife with objects referred to colloquially as coordinates and
momenta. These conjugate pairs play a special role in the dynamical formulation 
but do not necessarily imply any dimensional relationship to actual positions or 
momenta. 


11.1 Position, energy and momentum 


In classical particle mechanics, point particles have a definite position in space
at a particular time described by a dynamical trajectory $x(t)$. The momentum
is $p(t) = m\,\frac{dx(t)}{dt}$. In addition, one has the energy of the particle, $\frac{p^2}{2m} + V$, as a
book-keeping parameter for the history of the particle’s momentum transactions.


In the theory of fields, there is no a priori notion of particles: no variable 
in the theory represents discrete objects with deterministic trajectories; instead 
there is a continuous field covering the whole of space and changing in time. The 
position x is a coordinate parameter, not a dynamical variable. As Schwinger 
puts it, the coordinates in field theory play the role of an abstract measurement 
apparatus [119], a ruler or measuring rod which labels the stage on which the 
field evolves. Table 11.1 summarizes the correspondence. 

Table 11.1. Dynamical variables.

Canonical position     Particle mechanics     Field theory
Parameter space        $t$                    $x, t$
Dynamical variable     $x(t)$                 $\phi(x, t)$

The quantum theory is constructed by replacing the classical measures of
position, momentum and energy with operators satisfying certain commutation
relations:

$$ [x, p] = i\hbar \tag{11.1} $$

and

$$ [t, E] = -i\hbar. \tag{11.2} $$

These operators have to act on something, and indeed they act on the fields,
but the momentum and energy are represented by the operators themselves
independently of the nature of the fields. Let us see why this must be so. The
obvious solution to the commutators above is to represent $t$ and $x$ by algebraic
variables and $E$ and $p$ as differential operators:

$$ p_i = -i\hbar\,\partial_i, \qquad E = i\hbar\,\partial_t. \tag{11.3} $$
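
That the representation in eq. (11.3) satisfies eqs. (11.1) and (11.2) can be checked by letting the commutators act on an arbitrary test function; the function $f$ below is introduced only for this check:

```latex
\begin{align*}
[x, p]\,f &= x\,(-i\hbar\,\partial_x f) + i\hbar\,\partial_x (x f) = i\hbar\, f, \\
[t, E]\,f &= t\,(i\hbar\,\partial_t f) - i\hbar\,\partial_t (t f) = -i\hbar\, f .
\end{align*}
```

The opposite signs of the two commutators follow directly from the opposite signs of the two differential operators.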


If we check the dimensions of these operator expressions, we find that $\hbar\partial_i$ has
the dimensions of momentum and that $\hbar\partial_t$ has the dimensions of energy. In
other words, even though these operators have no meaning until they act on
some field, like this

$$ p_i\psi = -i\hbar\,\partial_i\psi, \qquad E\psi = i\hbar\,\partial_t\psi, \tag{11.4} $$


it is the operator, or its eigenvalues, which represent the momentum and energy. 
The field itself is merely a carrier of the information, which the operator extracts. 
In this way, it is possible for the classical analogues of energy and momentum, 
by assumption, to be represented by the same operators for all the fields. Thus 
the dimensions of these quantities are correct regardless of the dimensions of 
the field. 
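
A small numerical illustration of this point follows. It is a sketch only, assuming a plane wave $\psi = e^{i(kx - \omega t)}$ in units with $\hbar = 1$; the values of `k` and `omega` and the helper `central_difference` are hypothetical. The operators of eq. (11.4) are applied by finite differences and the result is divided by the field, so that only the eigenvalue remains.

```python
import numpy as np

hbar = 1.0            # units with hbar = 1 (an assumption of this sketch)
k, omega = 2.0, 3.0   # hypothetical wavenumber and angular frequency

def psi(x, t):
    """Plane-wave carrier field."""
    return np.exp(1j * (k * x - omega * t))

def central_difference(f, s, h=1e-5):
    """Symmetric finite-difference approximation to df/ds."""
    return (f(s + h) - f(s - h)) / (2.0 * h)

x0, t0 = 0.3, 0.7
# p = -i hbar d/dx and E = i hbar d/dt acting on psi, divided by psi itself
p_value = -1j * hbar * central_difference(lambda x: psi(x, t0), x0) / psi(x0, t0)
E_value = 1j * hbar * central_difference(lambda t: psi(x0, t), t0) / psi(x0, t0)
print(p_value.real, E_value.real)   # close to hbar * k and hbar * omega
```

Multiplying `psi` by any constant amplitude leaves both ratios unchanged, which is the sense in which the field merely carries the information extracted by the operator.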

The expectation values of these operators are related to the components of the 
energy-momentum tensor (see section 11.3), 


$$ p_i c = -\int d\sigma\, \theta_{0i} = \langle p_i c \rangle, \qquad E_p = \int d\sigma\, \theta_{00} = \langle H_D \rangle. \tag{11.5} $$

$H_D$ is the differential Hamiltonian operator, which through the equations of
motion is related to $i\hbar\partial_t$. The relationship does not work for the Klein–Gordon
field, because it is quadratic in time derivatives. Because of their relationship
with classical concepts of energy and momentum, $E_p$ and $p_i$ may also be
considered as mechanical energy and momenta.

Table 11.2. Canonical pairs for the fields.

Field            ‘X’          ‘P’
Klein–Gordon     $\phi$       $\hbar^2 c^2\,\partial_0\phi$
Dirac            $\psi$       $\psi^\dagger$
Schrödinger      $\psi$       $i\hbar\psi^*$
Maxwell          $A_\mu$      $D_{0i}$

Separate from these manifestations of mechanical transport are a number of 
other conjugate pairs. The field q itself is a basic variable in field theory, whose 
canonical conjugate oq is often referred to as a conjugate momentum; see 
table 11.2. That these quantities do not have the dimensions of position and 
momentum should be obvious from these expressions; thus, it should be clear 
that they are in no way connected with the mechanical quantities known from the 
classical theory. In classical electrodynamics there is also a notion of ‘hidden’ 
momentum which results from self-interactions [71] in the field. 


11.2 Particles and position 


The word particle is dogged by semantic confusion in the quantum theory of 
matter. The classical meaning of a particle, namely a localized pointlike object 
with mass and definite position, no longer has a primary significance in many 
problems. The quantum theory of fields is often credited with re-discovering 
the particle concept, since it identifies countable, discrete objects with a number 
operator in Fock space. The objects which are counted by this operator are 
really quanta, not particles in the classical sense. They are free, delocalized, 
plane wave objects with infinite extent. This is no problem for physics. In fact, 
it is possible to speak of momentum and energy transfer, without discussing 
the nature of the objects which carry these labels. However, it is sometimes 
important to discuss localizability. 

In spite of their conceptual demotion, it is clear that pointlike particle events 
are measured by detectors on a regular basis and thus have a practical signifi- 
cance. Accordingly, one is interested in determining how sharply it is possible 
to localize a particle in space, i.e. how sharp a peak can the wavefunction, and 
hence the probability, develop? Does this depend on the nature of the field, for 
instance, the other quantum numbers, such as mass and spin? This question 
was asked originally by Wigner and collaborators in the 1940s and answered for 
general mass and spin [6, 101]. 

The localizability of different types of particle depends on the existence of a 
Hermitian position operator which can measure it. This is related to the issue 
of physical derivatives in section 10.3. Finding such an operator is simple in
the case of the non-relativistic Schrödinger field, but is less trivial for relativistic
fields. In particular, massless fields, such as the photon, which travel at the speed 
of light, seem unlikely candidates for localization since they can never be halted 
in one place. 


11.2.1 Schrödinger field
The Schrödinger field has a scalar product

$$ (\psi, \psi) = \int d^n x\; \psi^*(x)\psi(x) = \int \frac{d^n k}{(2\pi)^n}\; \psi^*(k)\psi(k). \tag{11.6} $$


Its wavefunctions automatically have positive energy, and thus the position 
operator may be written 


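$$ (\psi, \hat{x}\psi) = \int d^n x\; \psi^*(x)\, x\, \psi(x) = \int \frac{d^n k}{(2\pi)^n}\; \psi^*(k)\left( i\frac{\partial}{\partial k} \right)\psi(k). \tag{11.7} $$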


This is manifestly Hermitian. If one translates one of these wavefunctions a 
distance a from the other, then, using 


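$$ \psi(a) = e^{ik\cdot a}\,\psi(0), \tag{11.8} $$

one has

$$ (\psi(a), \psi(0)) = \int d^n x\; \psi^*(a)\,\psi(0) = \delta(a) = \int \frac{d^n k}{(2\pi)^n}\; e^{ik\cdot a}. \tag{11.9} $$

This is an identity. It shows that the Schrödinger wavefunction can be localized
with delta-function precision. Point particles exist.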


11.2.2 Klein–Gordon field


The Klein–Gordon field does not automatically have only positive energy
solutions, so we must restrict the discussion to the set of solutions which have
strictly positive energy. The scalar product on this positive energy manifold is

$$ \begin{aligned} (\phi, \phi) &= \int d\sigma\, (\phi^* \overleftrightarrow{\partial_0}\, \phi) \\ &= \int (dk)\; \phi^*(k)\phi(k)\, \theta(-k_0)\, \delta(p^2c^2 + m^2c^4) \\ &= \int \frac{(dk)}{2|p_0|}\; |\phi_0(k)|^2. \end{aligned} \tag{11.10} $$

A translation by $a$ such that $\phi(a) = e^{ik\cdot a}\phi_0(k)$ makes the states orthogonal;

$$ \begin{aligned} (\phi(a), \phi(0)) &= \delta(a) \\ &= \int (dk)\; e^{ik\cdot a} \\ &= \int \frac{(dk)}{2|p_0|}\; e^{ik\cdot a}\, |\phi_0(k)|^2. \end{aligned} \tag{11.11} $$


For the last two lines to agree, we must have

$$ \phi_0(k) = \sqrt{2|p_0|}, \tag{11.12} $$

and thus the extent of the field about the point $a$ is given by

$$ \phi(x - a) = \int \frac{(dk)}{\sqrt{2|p_0|}}\; e^{ik\cdot(x - a)}, \tag{11.13} $$


which is not a delta function, and thus the Klein–Gordon particles do not exist
in the same sense that Schrödinger particles do. There exist only approximately
localizable concentrations of the field. The result of this integral in $n$ dimensions
can be expressed in terms of Bessel functions. For instance, in $n = 3$,

$$ \phi(x - a) \sim \left( \frac{m}{r} \right)^{\frac{5}{4}} H^{(1)}_{\frac{5}{4}}(imr), \tag{11.14} $$


where $r = |x - a|$. This lack of sharpness is reflected in the nature of the position
operator $\hat{x}$ acting on these states:

$$ (\phi, \hat{x}\phi) = \int \frac{(dk)}{2|p_0|}\; \phi^*(k)\left( i\frac{\partial}{\partial k} \right)\phi(k). \tag{11.15} $$

Clearly, the partial derivative $\frac{\partial}{\partial k}$ is not a Hermitian operator owing to the factors
of $p_0$ in the measure. It is easy to show (see section 10.3) that the addition of
the connection term,

$$ \hat{x} = i\frac{\partial}{\partial k} - \frac{i}{2}\frac{k}{p_0^2}, \tag{11.16} $$

is what is required to make this operator Hermitian.
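
The contrast between the two cases can be seen on a one-dimensional periodic lattice. The sketch below is an illustration under stated assumptions, not the calculation of the text: it uses units with $\hbar = c = 1$, a hypothetical mass and lattice, and the helper name `localized_profile` is invented here. A flat momentum amplitude, as in eq. (11.9), is compared with the weight $1/\sqrt{2|p_0|}$ of eq. (11.13).

```python
import numpy as np

def localized_profile(weight, n_sites=256, spacing=0.1):
    """Lattice version of f(x) = integral dk/(2 pi) weight(k) exp(i k x)."""
    k = 2.0 * np.pi * np.fft.fftfreq(n_sites, d=spacing)
    # ifft computes (1/N) sum_m w(k_m) exp(+i k_m x_j) with x_j = j * spacing;
    # dividing by the spacing turns the Kronecker delta into a lattice delta function.
    return np.fft.ifft(weight(k)).real / spacing

mass = 1.0  # hypothetical mass, in units with hbar = c = 1

# Flat amplitude: the Schrodinger case of eq. (11.9)
schrodinger = localized_profile(lambda k: np.ones_like(k))
# Amplitude 1/sqrt(2 omega): the Klein-Gordon case of eq. (11.13)
klein_gordon = localized_profile(lambda k: 1.0 / np.sqrt(2.0 * np.sqrt(k**2 + mass**2)))

print(np.count_nonzero(np.abs(schrodinger) > 1e-9))    # number of occupied sites
print(np.count_nonzero(np.abs(klein_gordon) > 1e-9))
```

The flat amplitude occupies a single lattice site, whereas the relativistic weight spreads the profile over neighbouring sites.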
11.2.3 Dirac field


The Dirac field also has both positive and negative energy states, and particle
wavefunctions must be restricted to positive energies. It shares with the
Klein–Gordon field the inability to produce sharp delta-function-like configurations
of the field. The expression for the position operator is extremely complicated
for the spin-$\frac{1}{2}$ particles, owing to the constraints imposed by the $\gamma$-matrices.
Although the procedure is the same, in principle, as for the Klein–Gordon field,
the details are aggravated by the complexity of the field equations for the Dirac
field.

The scalar product for localizable solutions is now, by analogy with
eqn. (11.11),

$$ (\psi, \psi) = \int \frac{(dk)}{(2p_0)^2}\; |\psi|^2, \tag{11.17} $$

since there is no time derivative in the scalar product. Restricting to positive
energies is also more complex, owing to the matrix nature of the equation. The
normalized positive energy solutions include factors of

$$ N = \frac{E}{E + mc^2} = \frac{-p_0}{(-p_0 + mc)}, \tag{11.18} $$

giving

$$ (\psi, \psi) = \int \frac{(dk)}{(2p_0)^2}\; u^\dagger N^\dagger N u. \tag{11.19} $$

A suitable Hermitian operator for the position

$$ \hat{x} = N\left( i\frac{\partial}{\partial k} + \Gamma \right) N \tag{11.20} $$

must now take into account all of these factors of the momentum.


11.2.4 Spin s fields in 3 + 1 dimensions
The generalization to any half-integral and integral massive spin fields can be
accomplished using Dirac’s construction for spin $\frac{1}{2}$. It is only sketched here. A
spin-$s$ field may be written as a direct product of $2s$ spin-$\frac{1}{2}$ blocks. Following
Wigner et al. [6, 101], the wavefunction may be written in momentum space as

$$ \psi(k)_\alpha \tag{11.21} $$

where $\alpha = 1, \ldots, 2s$ represents the components of $2s$ four-component spin
blocks (in total $2s \times 4$ components). The sub-spinors satisfy block-diagonal
equations of motion:

$$ (\gamma^\mu_\alpha p_\mu + mc)\,\psi_\alpha = 0. \tag{11.22} $$

The $\gamma$-matrices all satisfy the Clifford algebra relation (see chapter 20),

$$ \{\gamma^\mu_\alpha, \gamma^\nu_\alpha\} = 2g^{\mu\nu}. \tag{11.23} $$

The scalar product for localizable positive energy solutions may thus be found
by analogy with eqn. (11.17):
(ya) = | ODF v2.. ve 


imc] 2s+1 : 
= fow (=) Yi y2, (11.24) 


since, in the product over blocks, each normalization factor is multiplied in turn. 
Wigner et al. drop the factors of the mass arbitrarily in their definitions, since 
these contribute only dimensional factors. It is the factors of po which affect the 
localizability of the fields. The localizable wavefunction is thus of the form 


urepe (11.25) 
The normalization of the positive energy spinors is 
2s 
Dl (=) (11.26) 
E 2Po0 


Combining the factors of momentum, one arrives at a normalization factor of 


Po : | _2s+1 
N = | — | x 11.27 
(= + z) i ; , 


and a Hermitian position operator of the form 


z (dp) sd 
(y, xW) =f aon (uw (5 +r) vu). (11.28) 


Notice that the extra factors of the momentum lead to a greater de-localization. 
This expression contains the expressions for spin 0 and spin $\frac{1}{2}$ as special cases.
For massless fields, the above expressions hold for spin 0 and spin $\frac{1}{2}$, but break
down for spin 1, i.e. the photon.


11.3 The energy-momentum tensor $\theta_{\mu\nu}$


Translational invariance of the action implies the conservation of momentum. 
Time-translation invariance implies the conservation of energy. Generally, 
invariance under translations of one variable implies the conservation of its
conjugate variable. In this section, we see how symmetry under translations of
coordinates leads to the definition of energy, momentum and shear stress in a
mechanical system of fields.

In looking at dynamical variations of the action, we have been considering 
changes in the function $\phi(x)$. Now consider variations in the field which occur
because we choose to translate or transform the coordinates $x^\mu$, i.e.

$$ \delta_x\phi(x) = (\partial_\mu\phi(x))\,\delta x^\mu, \tag{11.29} $$

where we use $\delta_x$ to distinguish a coordinate variation and

$$ \delta x^\mu = x'^\mu - x^\mu. \tag{11.30} $$


The variation of the action under such a change is given by 


$$ \delta S = \int (dx')\, L(x') - \int (dx)\, L(x), \tag{11.31} $$

which is manifestly zero, in the absence of boundaries, since the first term
is simply a re-labelling of the second. We shall consider the action of an
infinitesimal change $\delta x^\mu$ and investigate what this tells us about the system.
Since we are not making a dynamical variation, we can expect to find quantities 
which are constant with respect to dynamics. 

To calculate eqn. (11.31), we expand the first term formally about x: 


$$ L(x') = L(x) + \delta L + \ldots = L(x) + (\partial_\mu L)\,\delta x^\mu + O((\delta x)^2). \tag{11.32} $$


The volume element transforms with the Jacobian 


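$$ (dx') = \det\left( \frac{\partial x'^\mu}{\partial x^\nu} \right)(dx), \tag{11.33} $$

thus, we require the determinant of

$$ \partial_\nu x'^\mu = \delta^\mu_{\ \nu} + (\partial_\nu\,\delta x^\mu). \tag{11.34} $$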


This would be quite difficult to compute generally, but fortunately we only
require the result to first order in $\delta x^\mu$. Writing out the matrix explicitly,
it is easy to see that all the terms which can contribute to first
order lie on the diagonal:

$$ \begin{pmatrix} 1 + \partial_1\delta x^1 & \partial_1\delta x^2 & \cdots \\ \partial_2\delta x^1 & 1 + \partial_2\delta x^2 & \cdots \\ \vdots & & \ddots \end{pmatrix}. \tag{11.35} $$

Now, the determinant is the product of all the terms along the diagonal, plus
some other terms involving off-diagonal elements which do not contribute to
first order; thus, it is easy to see that we must have

$$ \det(\partial_\nu x'^\mu) = 1 + \partial_\mu\delta x^\mu + O((\delta x)^2). \tag{11.36} $$

Using this result together with eqn. (11.32) in eqn. (11.31), we obtain, to first order,

$$ \delta S = \int (dx) \left\{ (\partial_\mu L)\,\delta x^\mu + (\partial_\mu\delta x^\mu)\, L \right\}. \tag{11.37} $$
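
The first-order expansion of the determinant is easy to test numerically. The following sketch uses a random matrix as a stand-in for the derivatives $\partial_\nu\delta x^\mu$; the matrix size, the seed and the helper name `first_order_error` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
gradient = rng.normal(size=(4, 4))   # stand-in for the matrix of derivatives of delta x

def first_order_error(eps):
    """Difference between det(1 + eps * gradient) and its first-order estimate."""
    jacobian = np.eye(4) + eps * gradient
    return np.linalg.det(jacobian) - (1.0 + eps * np.trace(gradient))

for eps in (1e-2, 1e-3, 1e-4):
    print(eps, first_order_error(eps))   # the mismatch shrinks roughly like eps**2
```

Only the trace, i.e. the sum of the diagonal entries, survives at first order; the off-diagonal entries first appear at order $(\delta x)^2$.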


Let us now use this result to consider the total variation of the action under a 
combined dynamical and coordinate variation. In principle, we should proceed 
from here for each Lagrangian we encounter. To make things more concrete, 
let us make the canonical assumption that we have a Lagrangian density which
depends on some generic field $q(x)$ and its derivative $\partial_\mu q(x)$. This assumption
leads to correct results in nearly all cases of interest. It fails for gauge theories,
because the definition of the velocity is not gauge-covariant, but we can return
to that problem later. We take

$$ L = L\left( q(x), \partial_\mu q(x), x^\mu \right). \tag{11.38} $$

Normally, in a conservative system, $x^\mu$ does not appear explicitly, but we
can include this for generality. Let us denote a functional variation by $\delta q$ as
previously, and the total variation of $q(x)$ by

$$ \delta_T q = \delta q + (\partial_\mu q)\,\delta x^\mu. \tag{11.39} $$


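The total variation of the action is now

$$ \delta_T S = \int (dx) \left\{ \frac{\partial L}{\partial q}\delta q + \frac{\partial L}{\partial(\partial_\mu q)}\delta(\partial_\mu q) + (\partial_\mu L)\,\delta x^\mu + (\partial_\mu\delta x^\mu)\, L \right\}, \tag{11.40} $$

where the first two terms originate from the functional variation in eqn. (4.21)
and the second two arise from the coordinate change in eqn. (11.32). We
now make the usual observation that the $\delta$ variation commutes with the partial
derivative (see eqn. (4.19)), and thus we may integrate by parts in the second
and fourth terms of this expression to give

$$ \begin{aligned} \delta_T S = & \int (dx) \left\{ \left( \frac{\partial L}{\partial q} - \partial_\mu\frac{\partial L}{\partial(\partial_\mu q)} \right)\delta q \right\} \\ & + \frac{1}{c}\int d\sigma^\mu \left\{ \frac{\partial L}{\partial(\partial^\mu q)}\delta q + L\,\delta x_\mu \right\}. \end{aligned} \tag{11.41} $$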


One identifies the first line as being that which gives rise to the Euler-Lagrange 
field equations. This term vanishes by virtue of the field equations, for any 
classically acceptable path. The remaining surface term can be compared with
eqn. (4.62) and represents a generator for the combined transformation. We
recognize the canonical momentum $\Pi_\mu$ from eqn. (4.66). To display this term
in its full glory, let us add and subtract

$$ \frac{\partial L}{\partial(\partial^\mu q)}(\partial_\nu q)\,\delta x^\nu \tag{11.42} $$

to the surface term, giving

$$ \begin{aligned} \delta_T S &= \frac{1}{c}\int d\sigma^\mu \left\{ \Pi_\mu\left( \delta q + (\partial_\nu q)\,\delta x^\nu \right) - \theta_{\mu\nu}\,\delta x^\nu \right\} \\ &= \frac{1}{c}\int d\sigma^\mu \left\{ \Pi_\mu\,\delta_T q - \theta_{\mu\nu}\,\delta x^\nu \right\}, \end{aligned} \tag{11.43} $$

where we have defined

$$ \theta_{\mu\nu} = \frac{\partial L}{\partial(\partial^\mu q)}(\partial_\nu q) - L\, g_{\mu\nu}. \tag{11.44} $$

This quantity is called the energy-momentum tensor. Its $\mu, \nu = 0, 0$ component
is the total energy density or Hamiltonian density of the system. Its $\mu, \nu =
0, i$ components are the momentum components. In fact, if we expand out the
surface term in eqn. (11.43) we have terms of the form

$$ \Pi\,\delta q - H\,\delta t + p\,\delta x + \cdots. \tag{11.45} $$


This shows how elegantly the action principle generates all of the dynamical 
entities of our covariant system and their respective conjugates (the delta objects 
can be thought of as the conjugates to each of the dynamical generators). 
Another way of expressing this is to say 


• $\Pi$ is the generator of $q$ translations,
• $H$ is the generator of $t$ translations,
• $p$ is the generator of $x$ translations,


and so on. That these differential operators are the generators of causal changes 
can be understood from method 2 of the example in section 7.1. A single partial 
derivative has a complementary Green function which satisfies 


$$ \partial_x G(x, x') = \delta(x, x'). $$ (11.46)

This Green function is simply the Heaviside step function $\theta(t - t')$ from
Appendix A, eqn. (A.2). What this is saying is that a derivative picks out a 
direction for causal change in the system. In other words, the response of the 
system to a source is channelled into a change in the coordinates and vice versa. 
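The causal role of the step function can be illustrated numerically. The following is a minimal sketch, not taken from the text: the pulse-shaped source, the integration grid and the function names are all illustrative assumptions. It integrates a source $J(t')$ against $\theta(t - t')$, which solves $\partial_t q = J$ with $q = 0$ in the remote past, and shows that the response only builds up after the source has been switched on.

```python
def heaviside(t):
    """Step function theta(t): 0 for t < 0 and 1 for t >= 0."""
    return 1.0 if t >= 0 else 0.0


def pulse(t):
    """Hypothetical source J(t): unit height for 0 <= t < 1, zero elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def causal_response(source, t, t_min=-5.0, t_max=5.0, steps=20000):
    """Approximate q(t) = integral dt' theta(t - t') J(t') by the midpoint rule."""
    dt = (t_max - t_min) / steps
    total = 0.0
    for k in range(steps):
        t_prime = t_min + (k + 0.5) * dt
        total += heaviside(t - t_prime) * source(t_prime) * dt
    return total


q_before = causal_response(pulse, -1.0)  # before the source acts
q_during = causal_response(pulse, 0.5)   # half-way through the pulse
q_after = causal_response(pulse, 3.0)    # after the pulse has ended

print(q_before, q_during, q_after)
```

Nothing is accumulated before the pulse starts, and the response saturates at the integrated strength of the source afterwards, which is the sense in which the derivative picks out a direction for change.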
11.3.1 Example: classical particle mechanics


To illustrate the energy-momentum tensor in the simplest of cases, we return to
the classical system, with the Lagrangian given by eqn. (4.5). This Lagrangian
has no $\mu\nu$ indices, so our dogged Lorentz-covariant formalism is strictly wasted,
but we may take $\mu$ to stand for the time $t$ or position $i$ and use the general
expression. Recognizing that the metric for classical particles is $\delta_{\mu\nu}$, rather than
$g_{\mu\nu}$, we have

$$ \theta_{tt} = \frac{\partial L}{\partial\dot q_i}\,\dot q_i - L\,\delta_{tt} $$
$$ \qquad = p_i\dot q^i - L $$
$$ \qquad = \frac{1}{2}p_i\dot q^i + V(q) $$
$$ \qquad = H. $$ (11.47)

The off-diagonal spacetime components give the momentum,

$$ \theta_{ti} = \frac{\partial L}{\partial\dot q_j}\frac{\partial q_j}{\partial q_i} = p_i, $$ (11.48)

and

$$ \theta_{ii} = -L, $$ (11.49)

which has no special interpretation. The off-diagonal $ij$ components vanish in
this case.
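As a quick check of the $tt$ component, the sketch below evaluates $\theta_{tt} = (\partial L/\partial\dot q)\dot q - L$ numerically and compares it with the energy. It assumes a Lagrangian of the form $L = \frac{1}{2}m\dot q^2 - V(q)$ with a harmonic potential; the potential, the parameter values and the function names are illustrative choices rather than anything fixed by the text.

```python
def lagrangian(q, qdot, m=2.0, k=3.0):
    """L = (1/2) m qdot^2 - V(q), with the assumed potential V = (1/2) k q^2."""
    return 0.5 * m * qdot ** 2 - 0.5 * k * q ** 2


def hamiltonian(q, qdot, m=2.0, k=3.0):
    """Kinetic plus potential energy for the same assumed system."""
    return 0.5 * m * qdot ** 2 + 0.5 * k * q ** 2


def theta_tt(q, qdot, h=1e-6):
    """theta_tt = (dL/dqdot) qdot - L, with the derivative taken by central difference."""
    p = (lagrangian(q, qdot + h) - lagrangian(q, qdot - h)) / (2 * h)
    return p * qdot - lagrangian(q, qdot)


q, qdot = 0.7, -1.3
energy_from_tensor = theta_tt(q, qdot)
energy_direct = hamiltonian(q, qdot)

print(energy_from_tensor, energy_direct)
```

The two numbers agree up to rounding error, since the central difference is exact for a Lagrangian which is quadratic in the velocity.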

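The analogous analysis can be carried out for relativistic point particles.
Using the action in eqn. (4.32), one finds that

$$ \theta_{\tau\tau} = \frac{\partial L}{\partial(\partial_\tau x)}(\partial_\tau x) + L $$
$$ \qquad = m u^2 - \frac{1}{2}m u^2 + V $$
$$ \qquad = \frac{1}{2}m u^2 + V, $$ (11.50)

where $u = dx/d\tau$ is the velocity, or

$$ \theta_{tt} = \frac{1}{2}m v^2 + V. $$ (11.51)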


11.3.2 Example: the complex scalar field 
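The application of eqn. (11.44) for the action

$$ S = \int (dx) \left\{ \hbar^2c^2(\partial^\mu\phi_A)^*(\partial_\mu\phi_A) + m^2c^4\phi_A^*\phi_A + V(\phi) \right\}, $$ (11.52)

gives us the following components for the energy-momentum tensor:

$$ \theta_{00} = \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A)}(\partial_0\phi_A) + \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A^*)}(\partial_0\phi_A^*) - \mathcal{L}\,g_{00} $$
$$ \qquad = \hbar^2c^2\left[(\partial_0\phi_A^*)(\partial_0\phi_A) + (\partial_i\phi_A^*)(\partial_i\phi_A)\right] + m^2c^4\phi_A^*\phi_A + V(\phi). $$ (11.53)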


Thus, the last line defines the Hamiltonian density $\mathcal{H}$, and the Hamiltonian is
given by

$$ H = \int d\sigma\,\mathcal{H}. $$ (11.54)


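The off-diagonal spacetime components define a momentum:

$$ \theta_{0i} = \theta_{i0} = \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A)}(\partial_i\phi_A) + \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A^*)}(\partial_i\phi_A^*) $$
$$ \qquad = \hbar^2c^2\left\{(\partial_0\phi_A^*)(\partial_i\phi_A) + (\partial_0\phi_A)(\partial_i\phi_A^*)\right\}. $$ (11.55)

Taking the integral over all space enables us to integrate by parts and write this
in a form which turns out to have the interpretation of the expectation value
(inner product) of the field momentum (see chapter 9):

$$ \int d\sigma\,\theta_{0i} = -i\hbar c\int d\sigma\left(\phi^*\,p_ic\,\partial_0\phi - (\partial_0\phi^*)\,p_ic\,\phi\right) $$
$$ \qquad = -(\phi, p_ic\,\phi), $$ (11.56)

where $p_i = -i\hbar\partial_i$. The diagonal space components are given by

$$ \theta_{ii} = \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A)}(\partial_i\phi_A) + \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A^*)}(\partial_i\phi_A^*) - \mathcal{L}\,g_{ii} $$
$$ \qquad = 2\hbar^2c^2(\partial_i\phi^*)(\partial_i\phi) - \mathcal{L}, $$ (11.57)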


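where $i$ is not summed. Similarly, the off-diagonal 'stress' components are given
by

$$ \theta_{ij} = \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A)}(\partial_j\phi_A) + \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A^*)}(\partial_j\phi_A^*) $$
$$ \qquad = \hbar^2c^2\left\{(\partial_i\phi_A^*)(\partial_j\phi_A) + (\partial_j\phi_A^*)(\partial_i\phi_A)\right\} $$
$$ \qquad = \hbar^{-1}c\,(\phi_A, p_ip_j\phi_A). $$ (11.58)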


From eqn. (11.57), we see that the trace over spatial components in $n + 1$
dimensions is

$$ \sum_i\theta_{ii} = \mathcal{H} - 2m^2c^4\phi_A^*\phi_A - 2V(\phi) - (n - 1)\mathcal{L}, $$ (11.59)

so that the full trace gives

$$ \theta^\mu{}_\mu = g^{\mu\nu}\theta_{\nu\mu} = -2m^2c^4\phi_A^*\phi_A - 2V(\phi) - (n - 1)\mathcal{L}. $$ (11.60)

Note that this vanishes in 1 + 1 dimensions for zero mass and potential.


11.3.3 Example: conservation 


We can also verify the energy-momentum conservation law, when the fields 
satisfy the equations of motion. We return to this issue in section 11.8.1. For the 
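simplest example of a scalar field with action

$$ S = \int (dx)\left\{\frac{1}{2}(\partial^\mu\phi)(\partial_\mu\phi) + \frac{1}{2}m^2\phi^2\right\}, $$ (11.61)

using eqn. (11.44), we obtain the energy-momentum tensor

$$ \theta_{\mu\nu} = (\partial_\mu\phi)(\partial_\nu\phi) - \frac{1}{2}\left[(\partial^\lambda\phi)(\partial_\lambda\phi) + m^2\phi^2\right]g_{\mu\nu}. $$ (11.62)

The spacetime divergence of this is

$$ \partial^\mu\theta_{\mu\nu} = -(-\Box\phi + m^2\phi)(\partial_\nu\phi) = 0. $$ (11.63)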


The right hand side vanishes as a result of the equations of motion, and thus the 
conservation law is upheld. 
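The conservation law can also be checked numerically for a simple solution. The sketch below is an illustration under stated assumptions: it works in 1 + 1 dimensions with $c = 1$ and metric diag(−1, +1), takes the plane wave $\phi = \cos(kx - \omega t)$, builds $\theta_{\mu\nu} = (\partial_\mu\phi)(\partial_\nu\phi) - \mathcal{L}g_{\mu\nu}$ for the free scalar Lagrangian, and evaluates $\partial^\mu\theta_{\mu\nu}$ by central differences. The function names and parameter values are hypothetical.

```python
import math


def theta(t, x, k, omega, m):
    """theta[mu][nu] for phi = cos(k x - omega t); metric diag(-1, +1), c = 1."""
    phase = k * x - omega * t
    phi = math.cos(phase)
    d = [omega * math.sin(phase), -k * math.sin(phase)]  # (d_t phi, d_x phi)
    g = [[-1.0, 0.0], [0.0, 1.0]]
    lagr = 0.5 * (-d[0] ** 2 + d[1] ** 2) + 0.5 * m ** 2 * phi ** 2
    return [[d[mu] * d[nu] - lagr * g[mu][nu] for nu in range(2)]
            for mu in range(2)]


def divergence(t, x, k, omega, m, h=1e-4):
    """d^mu theta_{mu nu} = -d_t theta_{0 nu} + d_x theta_{1 nu} (central differences)."""
    result = []
    for nu in range(2):
        d_t = (theta(t + h, x, k, omega, m)[0][nu]
               - theta(t - h, x, k, omega, m)[0][nu]) / (2 * h)
        d_x = (theta(t, x + h, k, omega, m)[1][nu]
               - theta(t, x - h, k, omega, m)[1][nu]) / (2 * h)
        result.append(-d_t + d_x)
    return result


k, m = 1.5, 0.8
# On shell: omega^2 = k^2 + m^2 solves the field equation.
on_shell = divergence(0.3, 0.4, k, math.sqrt(k ** 2 + m ** 2), m)
# Off shell: omega = k violates the dispersion relation when m is non-zero.
off_shell = divergence(0.3, 0.4, k, k, m)

print(on_shell, off_shell)
```

When the dispersion relation holds, both components of the divergence vanish to within discretization error; when it is violated, the divergence is proportional to the left hand side of the field equation and does not vanish.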

It is interesting to consider what happens if we add a potential $V(x)$ to the
action. This procedure is standard practice in quantum mechanics, for instance.
This can be done by shifting the mass in the action by $m^2 \to m^2 + V(x)$. The
result of this is the following expression:

$$ \partial^\mu\theta_{\mu\nu} = \left(\Box\phi - (m^2 + V(x))\phi\right)(\partial_\nu\phi) - \frac{1}{2}(\partial_\nu V)\phi^2 $$
$$ \qquad = -\frac{1}{2}(\partial_\nu V)\phi^2. $$ (11.64)

The first term vanishes again by virtue of the equations of motion. The term
containing the spacetime-dependent potential does not vanish, however. Conservation of
energy is only assured if there are no spacetime-dependent potentials. This
illustrates an important point, which is discussed more generally in section
11.8.1.

The reason that the conservation of energy is violated here is that a static 
potential of this kind is not physical. All real potentials change in response to 
an interaction with another field. By making a potential static, we are claiming 
that the form of V (x) remains unchanged no matter what we scatter off it. It is 
an immovable barrier. Conservation is violated because, in a physical system, 
we would take into account terms in the action which allow V(x) to change in 
response to the momentum imparted by $\phi$. See also exercise 1, at the end of this
chapter. 
11.4 Spacetime invariance and symmetry on indices


For reasons which should become apparent in section 11.6.1, the energy- 
momentum tensor, properly defined under maximal symmetry, is symmetrical 
under interchange of its indices. This reflects the symmetry of the metric tensor 
under interchange of indices. If the Lorentz symmetry is broken, however (for 
instance, in the non-relativistic limit), then this property ceases to apply. In a 
relativistic field theory, a non-symmetrical tensor may be considered simply 
incorrect; in the non-relativistic limit, only the spatial part of the tensor is 
symmetrical. 


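11.5 $\theta_{\mu\nu}$ for gauge theories

Consider the Maxwell action

$$ S = \int (dx)\left\{\frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} - J^\mu A_\mu\right\}. $$ (11.65)

A direct application of the formula in eqn. (11.44) gives an energy-momentum
tensor which is not gauge-invariant:

$$ \theta_{\mu\nu} = \frac{\partial\mathcal{L}}{\partial(\partial^\mu A^\alpha)}(\partial_\nu A^\alpha) - \frac{1}{4\mu_0}F^{\rho\sigma}F_{\rho\sigma}\,g_{\mu\nu}. $$ (11.66)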
The explicit appearance of $A_\mu$ in this result shows that this definition cannot
be physical for the Maxwell field. The reason for this lack of gauge invariance
can be traced to an inaccurate assumption about the nature of a translation, or
conformal transformation of the gauge field [44, 76]; it is related to the gauge
invariance of the theory. The expression for $\theta_{\mu\nu}$ in eqn. (11.44) relies on the
assumption in eqn. (11.29) that the expression for the variation in the field by
change of coordinates is given by

$$ \delta_x A_\mu = (\partial_\nu A_\mu)\delta x^\nu. $$ (11.67)


It is clear that this translation is not invariant with respect to gauge transforma- 
tions, but this seems to be wrong. After all, potential differences are observable 
as electric and magnetic fields between two points, and observable quantities 
should be gauge-invariant. In terms of this quantity, the energy-momentum 
tensor can be written as 


$$ \theta_{\mu\nu}\delta x^\nu = \frac{\partial\mathcal{L}}{\partial(\partial^\mu A^\alpha)}(\delta_x A^\alpha) - \frac{1}{4\mu_0}F^{\rho\sigma}F_{\rho\sigma}\,g_{\mu\nu}\delta x^\nu. $$ (11.68)

Suppose now that we use this as a more fundamental definition of $\theta_{\mu\nu}$. Our
problem is then to find a more appropriate definition of $\delta_x A_\mu$, which leads to a
gauge-invariant answer. The source of the problem is the implicit assumption
that the field at one point in spacetime should have the same phase as the field
at another point. In other words, under a translation of coordinates, we should 
expect the field to transform like a vector only up to a gauge transformation. 
Generalizing the transformation rule for the vector potential to account for this 
simple observation cures the problem entirely. The correct definition of this 
variation was derived in section 4.5.2. 

The correct (gauge-invariant) transformation is now found by noting that we 
may write 


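$$ \delta_x A^\mu = (\partial_\nu A^\mu)\epsilon^\nu + (\partial^\mu\epsilon_\nu)A^\nu $$
$$ \qquad = \epsilon^\nu F_\nu{}^\mu + \partial^\mu(\epsilon_\nu A^\nu). $$ (11.69)

This last expression has the form of a gauge-invariant translation plus a term which
can be interpreted as a gauge transformation $\partial^\mu s$ (where $s = \epsilon_\nu A^\nu$). Thus
we may now re-define the variation $\delta_x A^\mu$ to include a simultaneous gauge
transformation, leading to the gauge-invariant expression

$$ \delta_x A^\mu(x) \to \delta_x A^\mu - \partial^\mu s = \epsilon^\nu F_\nu{}^\mu, $$ (11.70)

where $\epsilon^\mu = \delta x^\mu$. The most general description of the translation $\epsilon^\mu$, in 3 + 1
dimensions is a 15-parameter solution to Killing's equation for the conformal
symmetry [76],

$$ \partial_\mu\epsilon_\nu + \partial_\nu\epsilon_\mu - \frac{1}{2}g_{\mu\nu}\partial_\gamma\epsilon^\gamma = 0, $$ (11.71)

with solution

$$ \epsilon^\mu(x) = a^\mu + b\,x^\mu + \omega^{\mu\nu}x_\nu + 2x^\mu c^\nu x_\nu - c^\mu x^2, $$ (11.72)

where $\omega^{\mu\nu} = -\omega^{\nu\mu}$. This explains why the conformal variation in the tensor $T_{\mu\nu}$
gives the correct result for gauge theories: the extra freedom can accommodate
x-dependent scalings of the fields, or gauge transformations.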
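That a vector field of the 15-parameter form $a^\mu + b\,x^\mu + \omega^{\mu\nu}x_\nu + 2x^\mu c^\nu x_\nu - c^\mu x^2$ solves the conformal Killing equation $\partial_\mu\epsilon_\nu + \partial_\nu\epsilon_\mu - \frac{1}{2}g_{\mu\nu}\partial_\gamma\epsilon^\gamma = 0$ can be confirmed numerically. The sketch below assumes a flat metric diag(−1, 1, 1, 1); the parameter values are arbitrary illustrative numbers and the function names are hypothetical. Since $\epsilon^\mu$ is quadratic in $x$, central differences are exact up to rounding.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]           # diagonal flat metric, signature (-+++)

A = [0.3, -1.0, 0.5, 2.0]             # translations a^mu (4 parameters)
B = 0.7                               # dilatation b (1 parameter)
C = [0.2, 0.1, -0.4, 0.3]             # special conformal c^mu (4 parameters)
W = [[0.0, 1.0, -2.0, 0.5],           # antisymmetric omega^{mu nu} (6 parameters)
     [-1.0, 0.0, 0.3, 0.0],
     [2.0, -0.3, 0.0, 1.1],
     [-0.5, 0.0, -1.1, 0.0]]


def eps_upper(x):
    """epsilon^mu(x) with upper index, for a point x^mu."""
    x_low = [ETA[m] * x[m] for m in range(4)]
    c_dot_x = sum(C[m] * x_low[m] for m in range(4))
    x_sq = sum(x[m] * x_low[m] for m in range(4))
    return [A[m] + B * x[m] + sum(W[m][n] * x_low[n] for n in range(4))
            + 2.0 * x[m] * c_dot_x - C[m] * x_sq for m in range(4)]


def grad_eps_lower(x, h=1e-3):
    """grad[mu][nu] = d_mu epsilon_nu, by central differences."""
    grad = []
    for mu in range(4):
        x_plus, x_minus = list(x), list(x)
        x_plus[mu] += h
        x_minus[mu] -= h
        e_plus, e_minus = eps_upper(x_plus), eps_upper(x_minus)
        grad.append([ETA[nu] * (e_plus[nu] - e_minus[nu]) / (2 * h)
                     for nu in range(4)])
    return grad


def killing_residual(x):
    """Largest component of the conformal Killing equation at the point x."""
    grad = grad_eps_lower(x)
    div = sum(ETA[g] * grad[g][g] for g in range(4))  # d_gamma epsilon^gamma
    return max(abs(grad[m][n] + grad[n][m]
                   - 0.5 * (ETA[m] if m == n else 0.0) * div)
               for m in range(4) for n in range(4))


residual = killing_residual([0.4, -0.2, 1.3, 0.9])
print(residual)
```

The parameter count 4 + 1 + 6 + 4 = 15 matches the number quoted above, and the residual is zero to within rounding error.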

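The anti-symmetry of $F_{\mu\nu}$ will now guarantee the gauge invariance of
$\theta_{\mu\nu}$. Using this expression in eqn. (11.43) for the energy-momentum tensor
(recalling $\epsilon^\mu = \delta x^\mu$) gives

$$ \theta_{\mu\nu} = \frac{\partial\mathcal{L}}{\partial(\partial^\mu A^\alpha)}F_\nu{}^\alpha - \mathcal{L}\,g_{\mu\nu} $$
$$ \qquad = 2\frac{\partial\mathcal{L}}{\partial F^{\mu\alpha}}F_\nu{}^\alpha - \mathcal{L}\,g_{\mu\nu} $$
$$ \qquad = \mu_0^{-1}F_{\mu\alpha}F_\nu{}^\alpha - \frac{1}{4\mu_0}F^{\rho\sigma}F_{\rho\sigma}\,g_{\mu\nu}. $$ (11.73)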


This result is manifestly gauge-invariant and can be checked against the traditional
expressions obtained from Maxwell's equations for the energy density and
the momentum flux. It also agrees with the Einstein energy-momentum tensor
$T_{\mu\nu}$.

The components in 3 + 1 dimensions evaluate to:

$$ \theta_{00} = \mu_0^{-1}F_{0i}F_0{}^i - \mathcal{L}\,g_{00} $$
$$ \qquad = \frac{E_iE^i}{\mu_0c^2} + \frac{1}{2\mu_0}\left(B_iB^i - \frac{E_iE^i}{c^2}\right) $$
$$ \qquad = \frac{1}{2\mu_0}\left(\frac{E^2}{c^2} + B^2\right) $$
$$ \qquad = \frac{1}{2}(\mathbf{E}\cdot\mathbf{D} + \mathbf{B}\cdot\mathbf{H}), $$ (11.74)


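which has the interpretation of an energy or Hamiltonian density. The spacetime
off-diagonal components are given by

$$ \theta_{0j} = \theta_{j0} = \mu_0^{-1}F_{0i}F_j{}^i $$
$$ \qquad = \mu_0^{-1}\epsilon_{ijk}E_iB_k/c $$
$$ \qquad = -\frac{(\mathbf{E}\times\mathbf{H})_j}{c}, $$ (11.75)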


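which has the interpretation of a 'momentum' density for the field. This
quantity is also known as Poynting's vector divided by the speed of light. The
conservation law is

$$ \partial^\mu\theta_{\mu 0} = -\frac{1}{c}\partial_t\mathcal{H} + \frac{1}{c}\partial_i(\mathbf{H}\times\mathbf{E})^i = -\frac{1}{c}\partial_\mu S^\mu = 0, $$ (11.76)

which may be compared with eqns. (2.70) and (2.73). Notice finally that

$$ \frac{\delta S}{\delta x^0} = -\frac{1}{c}\int d\sigma\,\theta_{00}, $$ (11.77)

and thus that

$$ \frac{\delta S}{\delta t} = -H, $$ (11.78)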


which is the energy density or Hamiltonian. We shall have use for this relation 
in chapter 14. 
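The component formulae for the Maxwell field can be checked with a few lines of arithmetic. The following sketch builds $F_{\mu\nu}$ from given field vectors and evaluates $\theta_{\mu\nu} = \mu_0^{-1}F_{\mu\alpha}F_\nu{}^\alpha - \frac{1}{4\mu_0}F^{\rho\sigma}F_{\rho\sigma}g_{\mu\nu}$. It assumes the metric diag(−1, 1, 1, 1) and the sign convention $F_{i0} = E_i/c$, $F_{ij} = \epsilon_{ijk}B_k$; these conventions, the numerical values of the fields and constants, and the function names are assumptions made for the illustration.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal flat metric, signature (-+++)


def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2


def field_tensor(E, B, c):
    """F[mu][nu] (lower indices), assuming F_{i0} = E_i/c and F_{ij} = eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[i + 1][0] = E[i] / c
        F[0][i + 1] = -E[i] / c
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F


def maxwell_theta(E, B, mu0, c):
    """theta[mu][nu] = F_{mu a} F_nu^a / mu0 - (F^2 / (4 mu0)) g_{mu nu}."""
    F = field_tensor(E, B, c)
    f_sq = sum(ETA[r] * ETA[s] * F[r][s] ** 2
               for r in range(4) for s in range(4))
    return [[sum(F[m][a] * F[n][a] * ETA[a] for a in range(4)) / mu0
             - (f_sq / (4 * mu0)) * (ETA[m] if m == n else 0.0)
             for n in range(4)] for m in range(4)]


E, B = [1.0, -2.0, 0.5], [0.3, 0.7, -1.1]
mu0, c = 2.0, 3.0  # arbitrary units, chosen only to exercise the factors

T = maxwell_theta(E, B, mu0, c)
energy_density = 0.5 * (sum(e * e for e in E) / c ** 2
                        + sum(b * b for b in B)) / mu0
trace = sum(ETA[m] * T[m][m] for m in range(4))
cross = [E[1] * B[2] - E[2] * B[1],
         E[2] * B[0] - E[0] * B[2],
         E[0] * B[1] - E[1] * B[0]]

print(T[0][0], energy_density, trace)
```

With these conventions $\theta_{00}$ reproduces $\frac{1}{2\mu_0}(E^2/c^2 + B^2)$, the mixed components reproduce $-(\mathbf{E}\times\mathbf{B})_j/\mu_0c$, the tensor is symmetric, and its trace vanishes in 3 + 1 dimensions.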


11.6 Another energy-momentum tensor $T_{\mu\nu}$
11.6.1 Variational definition

Using the action principle and the Lorentz invariance of the action, we have
viewed the energy-momentum tensor $\theta_{\mu\nu}$ as a generator for translations in space
and time. There is another quantity which we can construct which behaves as
an energy-momentum tensor: it arises naturally in Einstein's field equations
of general relativity as a source term for matter. This tensor is defined by the
variation of the action with respect to the metric tensor:

$$ T_{\mu\nu} = \frac{2}{\sqrt{g}}\frac{\delta S}{\delta g^{\mu\nu}}. $$ (11.79)

Clearly, this definition assumes that the action is covariant with respect to the
metric $g_{\mu\nu}$, so we should not expect this to work infallibly for non-relativistic
actions.

The connection between $T_{\mu\nu}$ and $\theta_{\mu\nu}$ is rather subtle and has to do with
conformal transformations. Conformal transformations (see section 9.6) are related
to re-scalings of the metric tensor, and they form a super-group, which contains
and extends the Lorentz transformation group; thus $T_{\mu\nu}$ admits more freedom
than $\theta_{\mu\nu}$. As it turns out, this extra freedom enables it to be covariant even
for local gauge theories, where fields are re-defined by spacetime-dependent
functions. The naive application of Lorentz invariance for scalar fields in section
11.3 does not automatically lead to invariance in this way; but it can be fixed, as
we shall see in the next section. The upshot of this is that, with the exception of
the Maxwell field and the Yang-Mills field, these two tensors are the same.

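To evaluate eqn. (11.79), we write the action with the metric made explicit,
and write the variation:

$$ \delta S = \int d^{n+1}x\,\sqrt{g}\,\delta g^{\mu\nu}\left\{\frac{1}{\sqrt{g}}\frac{\delta\sqrt{g}}{\delta g^{\mu\nu}}\,\mathcal{L} + \frac{\delta\mathcal{L}}{\delta g^{\mu\nu}}\right\}, $$ (11.80)

where we recall that $g = -\det g_{\mu\nu}$. To evaluate the first term, we note that

$$ \frac{\delta\sqrt{g}}{\delta g^{\mu\nu}} = -\frac{1}{2\sqrt{g}}\frac{\delta\det g_{\mu\nu}}{\delta g^{\mu\nu}}, $$ (11.81)

and use the identity

$$ \ln\det g_{\mu\nu} = \mathrm{Tr}\ln g_{\mu\nu}. $$ (11.82)

Varying this latter result gives

$$ \delta\ln(\det g_{\mu\nu}) = \mathrm{Tr}\,\delta\ln g_{\mu\nu}, $$ (11.83)

or

$$ \frac{\delta(\det g_{\mu\nu})}{\det g_{\mu\nu}} = g^{\mu\nu}\delta g_{\mu\nu} = -g_{\mu\nu}\delta g^{\mu\nu}. $$ (11.84)

Using this result, together with eqn. (11.81), in eqn. (11.80), we obtain

$$ T_{\mu\nu} = 2\frac{\partial\mathcal{L}}{\partial g^{\mu\nu}} - g_{\mu\nu}\mathcal{L}. $$ (11.85)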
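The key step in this derivation is the variation of $\sqrt{g}$ with respect to the inverse metric, which combines to give $\partial\sqrt{g}/\partial g^{\mu\nu} = -\frac{1}{2}\sqrt{g}\,g_{\mu\nu}$. The sketch below checks this relation by finite differences for a two-dimensional example. The particular inverse metric is an arbitrary assumed example with Lorentzian signature, each component of $g^{\mu\nu}$ is varied independently, and the helper names are hypothetical.

```python
import math


def inverse_2x2(a):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def sqrt_g(g_upper):
    """sqrt(g) with g = -det(g_{mu nu}), computed from the inverse metric g^{mu nu}."""
    g_lower = inverse_2x2(g_upper)
    det_lower = g_lower[0][0] * g_lower[1][1] - g_lower[0][1] * g_lower[1][0]
    return math.sqrt(-det_lower)


g_upper = [[-1.2, 0.3], [0.3, 0.9]]  # assumed example, signature (-+)
g_lower = inverse_2x2(g_upper)

h = 1e-6
max_error = 0.0
for mu in range(2):
    for nu in range(2):
        plus = [row[:] for row in g_upper]
        minus = [row[:] for row in g_upper]
        plus[mu][nu] += h
        minus[mu][nu] -= h
        numeric = (sqrt_g(plus) - sqrt_g(minus)) / (2 * h)
        expected = -0.5 * sqrt_g(g_upper) * g_lower[mu][nu]
        max_error = max(max_error, abs(numeric - expected))

print(max_error)
```

Inserting this relation into the variation of the action is what produces the $-g_{\mu\nu}\mathcal{L}$ term of the metric-derived tensor.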
This definition is tantalizingly close to that for the Lorentz symmetry variation,
except for the replacement of the first term. In many cases, the two definitions
give the same result, but this is not the case for the gauge field, where $T_{\mu\nu}$
gives the correct answer, but a naive application of $\theta_{\mu\nu}$ does not. The clue as
to their relationship is to consider how the metric transforms under a change of
coordinates (see chapter 25). Relating a general metric $g_{\mu\nu}$ to a locally inertial
frame $\eta_{\mu\nu}$, one has

$$ g_{\mu\nu} = V^\alpha{}_\mu V^\beta{}_\nu\,\eta_{\alpha\beta}, $$ (11.86)

where the vielbein $V^\alpha{}_\mu = \partial_\mu X^\alpha$, so that

$$ g^{\mu\nu}(\partial_\mu\phi)(\partial_\nu\phi) = \eta^{\alpha\beta}V^\mu{}_\alpha V^\nu{}_\beta(\partial_\mu\phi)(\partial_\nu\phi). $$ (11.87)

In terms of these quantities, one has

$$ T_{\mu\nu} = \frac{2}{\sqrt{g}}\frac{\delta S}{\delta g^{\mu\nu}} = \frac{V_{\alpha\mu}}{\det V}\frac{\delta S}{\delta V^\nu{}_\alpha}. $$ (11.88)

Thus, one sees that variation with respect to a vector, as in the case of $\theta_{\mu\nu}$,
will only work if the vector transforms fully covariantly under every symmetry.
Given that the maximal required symmetry is the conformal symmetry, one may
regard $T_{\mu\nu}$ as the correct definition of the energy-momentum tensor.


11.6.2 The trace of the energy-momentum tensor $T_{\mu\nu}$

The conformal invariance of the field equations is reflected in the trace of the
energy-momentum tensor $T_{\mu\nu}$, which we shall meet in the next chapter. Its
trace vanishes for actions which are conformally invariant. To see this, we note
that, in a conformally invariant theory,

$$ \frac{\delta S}{\delta\Omega} = 0. $$ (11.89)

If we express this in terms of the individual partial transformations, we have

$$ \frac{\delta S}{\delta\Omega} = \frac{\delta S}{\delta g^{\mu\nu}}\frac{\delta g^{\mu\nu}}{\delta\Omega} + \frac{\delta S}{\delta\phi}\frac{\delta\phi}{\delta\Omega} = 0. $$ (11.90)

Assuming that the transformation is invertible, and that the field equations are
satisfied,

$$ \frac{\delta S}{\delta\phi} = 0, $$ (11.91)

we have

$$ \frac{1}{2}\sqrt{g}\,T_{\mu\nu}\frac{\delta g^{\mu\nu}}{\delta\Omega} = 0. $$ (11.92)

Since $\delta g^{\mu\nu}/\delta\Omega$ must be proportional to $g^{\mu\nu}$, we have simply that

$$ T^\mu{}_\mu = \mathrm{Tr}\,T_{\mu\nu} = 0. $$ (11.93)


A similar argument applies to the tensor $\theta_{\mu\nu}$, since the two tensors (when defined
correctly) agree. In the absence of conformal invariance, one may expand the
trace in the following way:

$$ T^\mu{}_\mu = \beta^i\mathcal{L}^i, $$ (11.94)

where $\mathcal{L}^i$ are terms in the Lagrangian of $i$th order in the fields. $\beta^i$ is then called
the beta function for this term. It occurs in renormalization group and scaling
theory.


11.6.3 The conformally improved $T_{\mu\nu}$

The uncertainty in the definition of the energy-momentum tensors $\theta_{\mu\nu}$ and $T_{\mu\nu}$
is usually understood as the freedom to change boundary conditions by adding 
total derivatives, i.e. surface terms, to the action. However, another explanation 
is forthcoming: such boundary terms are generators of symmetries, and one 
would therefore be justified in suspecting that symmetry covariance plays a 
role in the correctness of the definition. It has emerged that covariance, with 
respect to the conformal symmetry, frequently plays a role in elucidating a 
sensible definition of this tensor. While this symmetry might seem excessive 
in many physical systems, where one would not expect to see such a symmetry, 
its structure encompasses a generality which ensures that all possible terms are 
generated, before any limit is taken. 

In the case of the energy-momentum tensor, the conformal symmetry motivates
improvements not only for gauge theories, but also with regard to
scaling anomalies. The tracelessness of the energy-momentum tensor for a
massless field is only guaranteed in the presence of conformal symmetry, but
such a symmetry usually demands a specific spacetime dimensionality. What is
interesting is that a fully covariant, curved spacetime formulation of $T_{\mu\nu}$ leads
to an invariant definition, which ensures a vanishing $T^\mu{}_\mu$ in the massless limit
[23, 26, 119].

The freedom to add total derivatives means that one may write

$$ T_{\mu\nu} \to T_{\mu\nu} + \nabla^\rho\nabla^\sigma m_{\mu\nu\rho\sigma}, $$ (11.95)

where $m_{\mu\nu\rho\sigma}$ is a function of the metric tensor, and is symmetrical on $\mu, \nu$ and
$\rho, \sigma$ indices; additionally it satisfies:

$$ m_{\mu\nu\rho\sigma} + m_{\rho\nu\sigma\mu} + m_{\sigma\nu\mu\rho} = 0. $$ (11.96)

These are also the symmetry properties of the Riemann tensor (see eqn. (25.24)).
This combination ensures that the additional terms are conserved:

$$ \nabla^\mu\Delta T_{\mu\nu} = \nabla^\mu\nabla^\rho\nabla^\sigma m_{\mu\nu\rho\sigma} = 0. $$ (11.97)

The properties of the Riemann tensor imply that the following additional
invariant term may be added to the action:

$$ \Delta S = \int (dx)\,\xi\,m^{\mu\nu\rho\sigma}R_{\mu\nu\rho\sigma}. $$ (11.98)
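For spin-0 fields, the only invariant combination of correct dimension is

$$ m^{\mu\nu\rho\sigma} = \left(g^{\mu\nu}g^{\rho\sigma} - \frac{1}{2}g^{\mu\rho}g^{\nu\sigma} - \frac{1}{2}g^{\mu\sigma}g^{\nu\rho}\right)\phi^2, $$ (11.99)

which gives the term

$$ \Delta S = \int (dx)\,\xi R\,\phi^2, $$ (11.100)

where $R$ is the scalar curvature (see chapter 25). Thus, the modified action,
which must be temporarily interpreted in curved spacetime, is

$$ S = \int (dx)\left\{\frac{1}{2}(\nabla^\mu\phi)(\nabla_\mu\phi) + \frac{1}{2}(m^2 + \xi R)\phi^2\right\}, $$ (11.101)

where $(dx) = \sqrt{g}\,d^{n+1}x$. Varying this action with respect to the metric leads to

$$ T_{\mu\nu} = (\nabla_\mu\phi)(\nabla_\nu\phi) - \frac{1}{2}g_{\mu\nu}\left[(\nabla^\lambda\phi)(\nabla_\lambda\phi) + m^2\phi^2\right] $$
$$ \qquad + \xi(\nabla_\mu\nabla_\nu - g_{\mu\nu}\Box)\phi^2. $$ (11.102)

Notice that the terms proportional to $\xi$ do not vanish, even in the limit $R \to 0$,
i.e. $\nabla_\mu \to \partial_\mu$. The resulting additional piece is a classic $(n + 1)$ dimensional,
transverse (conserved) vector displacement. Indeed, it has the conformally
invariant form of the Maxwell action, stripped of its fields. The trace of this
tensor may now be computed, giving:

$$ T^\mu{}_\mu = \left[\frac{1 - n}{2} + 2\xi n\right](\partial^\mu\phi)(\partial_\mu\phi) - \frac{1}{2}(n + 1)m^2\phi^2. $$ (11.103)

One now sees that it is possible to choose $\xi$ such that it vanishes in the massless
limit; i.e.

$$ T^\mu{}_\mu = -\frac{1}{2}(n + 1)m^2\phi^2, $$ (11.104)

where

$$ \xi = \frac{n - 1}{4n}. $$ (11.105)

This value of $\xi$ is referred to as conformal coupling. In 3 + 1 dimensions, it has
the value of $\frac{1}{6}$, which is often assumed explicitly.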
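The arithmetic behind the conformal coupling is easily tabulated. The sketch below uses exact rational arithmetic to evaluate $\xi = (n - 1)/4n$ for a few spatial dimensions $n$ and to confirm that this choice cancels the coefficient $\frac{1 - n}{2} + 2\xi n$ of the derivative term in the trace. The function names are hypothetical; only the two formulae themselves are taken from the discussion above.

```python
from fractions import Fraction


def conformal_coupling(n):
    """xi = (n - 1) / (4 n) for n spatial dimensions."""
    return Fraction(n - 1, 4 * n)


def kinetic_trace_coefficient(n, xi):
    """Coefficient (1 - n)/2 + 2 xi n multiplying the derivative term of the trace."""
    return Fraction(1 - n, 2) + 2 * xi * n


table = {n: conformal_coupling(n) for n in range(1, 6)}
coefficients = {n: kinetic_trace_coefficient(n, xi) for n, xi in table.items()}

for n, xi in table.items():
    print(f"{n} + 1 dimensions: xi = {xi}, remaining coefficient = {coefficients[n]}")
```

The table reproduces the familiar value 1/6 in 3 + 1 dimensions, and shows that no improvement term is needed in 1 + 1 dimensions, where $\xi = 0$.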
11.7 Angular momentum and spin¹


The topic of angular momentum in quantum mechanics is one of the clas- 
sic demonstrations of the direct relevance of group theory to the nature of 
microscopic observables. Whereas linear momentum more closely resembles 
its Abelian classical limit, the microscopic behaviour of rotation at the level 
of particles within a field is quite unexpected. The existence of intrinsic, 
half-integral spin S, readily predicted by representation theory of the rotation 
group in 3 + 1 dimensions, has no analogue in a single-valued differential 
representation of the orbital angular momentum L. 


11.7.1 Algebra of orbital motion in 3 + 1 dimensions 


The dynamical commutation relations of quantum mechanics fix the algebra 
of angular momentum operators. It is perhaps unsurprising, at this stage, 
that the canonical commutation relations for position and momentum actually 
correspond to the Lie algebra for the rotation group. The orbital angular 
momentum of a body is defined by 


$$ \mathbf{L} = \mathbf{r}\times\mathbf{p}. $$ (11.106)

In component notation in n-dimensional Euclidean space, one writes

$$ L_i = \epsilon_{ijk}x^jp^k. $$ (11.107)

The commutation relations for position and momentum

$$ [x^i, p^j] = i\chi_h\,\delta^{ij} $$ (11.108)

then imply that (see section 11.9)

$$ [L_i, L_j] = i\chi_h\,\epsilon_{ijk}L_k. $$ (11.109)

This is a Lie algebra. Comparing it with eqn. (8.47) we see the correspondence
between the generators and the angular momentum components,

$$ T^a \leftrightarrow L^a/\chi_h $$
$$ f_{abc} = -\epsilon_{abc}, $$ (11.110)


with the group space $a, b, c \leftrightarrow i, j, k$ corresponding to the Euclidean spatial
basis vectors. What this shows, however, is that the group theoretical description
of rotation translates directly into the operators of the dynamical theory, with a
dimensionful scale $\chi_h$, which in quantum mechanics is $\chi_h = \hbar$. This happens,
as discussed in section 8.1.3, because we are representing the dynamical
variables (fields or wavefunctions) as tensors which live on the representation
space of the group (spacetime) by a mapping which is adjoint (the group space
and representation space are the same).

¹ A full understanding of this section requires a familiarity with Lorentz and Poincaré symmetry
from section 9.4.
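The algebra $[L_i, L_j] = i\chi_h\epsilon_{ijk}L_k$ can be exhibited concretely in the adjoint representation, where the generators are themselves built from the structure constants. The following sketch is an illustration only: it sets $\chi_h = 1$, uses the common choice $(T_i)_{jk} = -i\epsilon_{ijk}$ (an assumption about normalization and sign, not a statement of the conventions of eqn. (8.47)), and verifies the commutators by explicit 3 × 3 matrix multiplication.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def generator(i):
    """Adjoint representation matrix (T_i)_{jk} = -i eps_{ijk} (assumed convention)."""
    return [[-1j * eps(i, j, k) for k in range(3)] for j in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]


T = [generator(i) for i in range(3)]


def residual(i, j):
    """Largest entry of [T_i, T_j] - i eps_{ijk} T_k."""
    lhs = commutator(T[i], T[j])
    rhs = [[sum(1j * eps(i, j, k) * T[k][r][c] for k in range(3))
            for c in range(3)] for r in range(3)]
    return max(abs(lhs[r][c] - rhs[r][c]) for r in range(3) for c in range(3))


max_residual = max(residual(i, j) for i in range(3) for j in range(3))
print(max_residual)
```

Restoring the scale amounts to multiplying each generator by $\chi_h$, which multiplies both sides of the commutation relation consistently.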


11.7.2 The nature of angular momentum in n + 1 dimensions 


In spite of its commonality, the nature of rotation is surprisingly non-intuitive, 
perhaps because many of its everyday features are taken for granted. The 
freedom for rotation is intimately linked to the dimension of spacetime. This 
much is clear from intuition, but, as we have seen, the physics of dynamical 
systems depends on the group properties of the transformations, which result 
in rotations. Thus, to gain a true intuition for rotation, one must look to the 
properties of the rotation group in n + 1 dimensions. 

In one dimension, there are not enough degrees of freedom to admit rotations. 
In 2 + 1 dimensions, there is only room for one axis of rotation. Then we have 
an Abelian group $U(1)$ with continuous eigenvalues $\exp(i\theta)$. These 'circular
harmonics' or eigenfunctions span this continuum. The topology of this space
gives boundary conditions which can lead to any statistics under rotation, i.e.
anyons.

In 3 + 1 dimensions, the rank 2-tensor components of the symmetry group
generators behave like two separate 3-vectors, those arising in the timelike
components $T^{0i}$ and those arising in the spacelike components $\frac{1}{2}\epsilon^{ijk}T_{jk}$; indeed,
the electric and magnetic components of the electromagnetic field are related
to the electric and magnetic components of the Lorentz group generators.
Physically, we know that rotations and coils are associated with magnetic fields,
so this ought not be surprising. The rotation group in 3 + 1 dimensions is
the non-Abelian $SO(3)$, and the maximal Abelian sub-group (the centre) has
eigenvalues $\pm 1$. These form a $Z_2$ sub-group and reflect the topology of the
group, giving rise to two possible behaviours under rotation: symmetrical and
anti-symmetrical boundary conditions corresponding in turn to Bose-Einstein
and Fermi-Dirac statistics.

In higher dimensions, angular momentum has a tensor character and is 
characterized by n-dimensional spherical harmonics [130]. 


11.7.3 Covariant description in 3 + 1 dimensions 


The angular momentum of a body at position r, about an origin, with momentum 
p, is defined by 


$$ \mathbf{J} = \mathbf{L} + \mathbf{S} = (\mathbf{r}\times\mathbf{p}) + \mathbf{S}. $$ (11.111)

The first term, constructed from the cross-product of the position and linear
momentum, is the contribution to the orbital angular momentum. The second
term, $\mathbf{S}$, is the spin, or intrinsic angular momentum, of the body. The total
angular momentum is a conserved quantity and may be derived from the
energy-momentum tensor in the following way. Suppose we have a conserved
energy-momentum tensor $\theta_{\mu\nu}$, which is symmetrical in its indices (Lorentz-invariant),
then

$$ \partial_\mu\theta^{\mu\nu} = 0. $$ (11.112)

We can construct the new axial tensor,

$$ L^{\mu\nu\lambda} = x^\nu\theta^{\mu\lambda} - x^\lambda\theta^{\mu\nu}, $$ (11.113)

which is also conserved, since

$$ \partial_\mu L^{\mu\nu\lambda} = \theta^{\nu\lambda} - \theta^{\lambda\nu} = 0. $$ (11.114)

Comparing eqn. (11.113) with eqn. (11.111), we see that $L^{\mu\nu\lambda}$ is a generalized
vector product, since the components of $\mathbf{r}\times\mathbf{p}$ are of the form $L_1 = r_2p_3 - r_3p_2$,
or $L_i = \epsilon_{ijk}r_jp_k$. We may then identify the angular momentum 2-tensor as the
anti-symmetrical matrix

$$ J^{\mu\nu} = \int d\sigma\,L^{0\mu\nu} = \int d\sigma\,(x^\mu\theta^{0\nu} - x^\nu\theta^{0\mu}), $$ (11.115)

which is related to the generators of homogeneous Lorentz transformations
(generalized rotations on spacetime) by

$$ J^{\mu\nu} = \chi_h T^{\mu\nu}; $$ (11.116)

see eqn. (9.95). The $ij$ components of $J^{\mu\nu}$ are simply the components of $\mathbf{r}\times\mathbf{p}$.
The $i0$ components are related to boosts. Clearly, this matrix is conserved,

$$ \partial_\mu J^{\mu\nu} = 0. $$ (11.117)

Since the coordinates $x^\mu$ appear explicitly in the definition of $J^{\mu\nu}$, it is not
invariant under translations of the origin. Under the translation $x^\mu \to x^\mu + a^\mu$,
the components transform into

$$ J^{\mu\nu} \to J^{\mu\nu} + (a^\mu p^\nu - a^\nu p^\mu) $$ (11.118)

(see eqn. (11.5)). This can be compared with the properties of eqn. (9.153).
To isolate the part of $T_{\mu\nu}$ which is intrinsic to the field (i.e. is independent of
position), we may either evaluate in a rest frame $p_i = 0$ or define, in 3 + 1
dimensions, the dual tensor

$$ S^{\mu\nu} = \frac{1}{2}\epsilon^{\mu\nu\lambda\rho}J_{\lambda\rho}. $$ (11.119)
The anti-symmetry of the Levi-Civita tensor ensures that the extra terms in
eqn. (11.118) cancel. We may therefore think of this as being the generator
of the intrinsic angular momentum of the field or spin. This dual tensor is
rather formal though and not very useful in practice. Rather, we consider the
Pauli-Lubanski vector as introduced in eqn. (9.161). We define a spin 4-vector
by

$$ \frac{1}{2}S_\mu \equiv \chi_h W_\mu = \frac{1}{2}\epsilon_{\mu\nu\lambda\rho}J^{\nu\lambda}p^\rho, $$ (11.120)

so that, in a rest frame,

$$ \chi_h W^\mu_{\rm rest} = -\frac{1}{2}mc\,(0, S^i), $$ (11.121)

where $S^i$ is the intrinsic spin angular momentum, which is defined by

$$ S^i = \frac{1}{2}\epsilon^{ijk}J_{jk}\Big|_{p_i = 0}, $$ (11.122)

with eigenvalues $s(s + 1)\chi_h^2$ and $m_s\chi_h$, where $s = e + f$.


11.7.4 Intrinsic spin of tensor fields in 3 + 1 dimensions 


Tensor fields are classified by their intrinsic spin in 3 + 1 dimensions. We
speak of fields with intrinsic spin 0, 1/2, 1, 3/2, 2, .... These labels usually refer
to 3 + 1 dimensions, and may differ in other numbers of dimensions since they
involve counting the number of independent components in the tensors, which
differs since the representation space is spacetime for the Lorentz symmetry.
The number depends on the dimension and transformation properties of the
matrix representation, which defines a rotation of the field. The homogeneous
(translation independent) Lorentz group classifies these properties of the field in
3 + 1 dimensions,

    Field                   Spin
    $\phi(x)$               0
    $\psi_\alpha(x)$        1/2
    $A_\mu$                 1
    $\psi_{\mu\alpha}$      3/2
    $g_{\mu\nu}$            2

where $\mu, \nu = 0, 1, 2, 3$. Although fields are classified by their spin properties,
this is not enough to be able to determine the rotational modes of the field. The
mass also plays a role. This is perhaps most noticeable for the spin-1 field $A_\mu$.
In the massless case, it has helicities $\lambda = \pm 1$, whereas in the massive case it can
take on the additional value of zero. The reason for the difference follows from
a difference in the true spacetime symmetry of the field in the two cases. We
shall explore this below.

From section 9.4.3 we recall that the irreducible representations of the Lorentz
group determine the highest weight or spin $s = e + f$ of a field. If we set
the generators of boosts to zero by taking $\omega_{0i}T^{0i} = 0$ in eqn. (9.95), then we
obtain the pure spatial rotations of section 8.5.10. Then the generators of the
Lorentz group $E_i$ and $F_i$ become identical, and we may define the spin of a
representation by the operator

$$ S_i = E_i + F_i = \chi_h T_{B\,i}. $$ (11.123)

The Casimir operator for the defining (vector field) representation is then

$$ S^2 = \chi_h^2\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 2 \end{pmatrix}. $$ (11.124)

This shows that the rotational 3-vector part of the defining representation forms
an irreducible module, leaving an empty scalar component in the time direction.
One might expect this; after all, spatial rotations ought not to involve timelike
components. If we ignore the time component, then we easily identify the spin
of the vector field as follows. From section 8.5.10 we know that in representation
$G_R$, the Casimir operator is proportional to the identity matrix with value

$$ S^2 = S^iS_i = s(s + 1)\chi_h^2\,I_R, $$ (11.125)

and $s = e + f$. Comparing this with eqn. (11.124) we have $s(s + 1) = 2$, thus
$s = 1$ for the vector field. We say that a vector field has spin 1.
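The value $s(s + 1) = 2$ can be reproduced by summing the squares of the rotation generators in the vector representation. The sketch below assumes $\chi_h = 1$ and the generators $(T_i)_{jk} = -i\epsilon_{ijk}$ acting on the three spatial components (an illustrative convention), embeds the result in a 4 × 4 matrix with an empty time row and column, and then solves $s(s + 1) = 2$ for the positive root.

```python
import math


def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def rotation_generator(i):
    """Vector representation of spatial rotations: (T_i)_{jk} = -i eps_{ijk}."""
    return [[-1j * eps(i, j, k) for k in range(3)] for j in range(3)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


# Spatial Casimir operator: sum over i of T_i T_i.
casimir3 = [[0j] * 3 for _ in range(3)]
for i in range(3):
    t = rotation_generator(i)
    tt = matmul(t, t)
    casimir3 = [[casimir3[r][c] + tt[r][c] for c in range(3)] for r in range(3)]

# Embed in the 4-vector representation: the time component does not rotate.
casimir4 = [[0.0] * 4 for _ in range(4)]
for r in range(3):
    for c in range(3):
        casimir4[r + 1][c + 1] = casimir3[r][c].real

value = casimir4[1][1]                               # equals s(s + 1)
spin = (-1.0 + math.sqrt(1.0 + 4.0 * value)) / 2.0   # positive root

for row in casimir4:
    print(row)
print("spin =", spin)
```

The printed matrix is diagonal with entries 0, 2, 2, 2, which is the block structure described above, and the positive root gives spin 1.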

Although the vector transformation leads us to a value for the highest weight 
spin, this does not necessarily tell us about the intermediate values, because 
there are two ways to put together a spin-1 representation. One of these applies 
to the massless (transverse) field and the other to the massive Proca field, which 
was discussed in section 9.4.4. As another example, we take a rank 2-tensor 
field. This transforms like 


$$ G_{\mu\nu} \to L_\mu{}^\rho L_\nu{}^\sigma\,G_{\rho\sigma}. $$ (11.126)

In other words, two vector transformations are required to transform this, one
for each index. The product of two such matrices has an equivalent vector form
with irreducible blocks:

$$ \underbrace{(1,1)}_{\text{traceless symmetric}} \oplus \underbrace{(1,0)\oplus(0,1)}_{\text{anti-symmetric}} \oplus \underbrace{(0,0)}_{\text{trace}}. $$

This is another way of writing the result which was found in section 3.76 using
more pedestrian arguments. The first has $(2e + 1)(2f + 1) = 9$ $(e = f = 1)$
spin $e + f = 2$ components; the second two blocks are six spin-1 parts; and the
last term is a single scalar component, giving 16 components in all, which is the
number of components in the second-rank tensor.
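The component counting can be written out explicitly. The short sketch below simply evaluates $(2e + 1)(2f + 1)$ for each block of the decomposition and adds the results; the dictionary labels and the function name are illustrative.

```python
def dimension(e, f):
    """Number of components (2e + 1)(2f + 1) of the Lorentz representation (e, f)."""
    return int((2 * e + 1) * (2 * f + 1))


blocks = {
    "traceless symmetric (1,1)": (1, 1),
    "anti-symmetric (1,0)": (1, 0),
    "anti-symmetric (0,1)": (0, 1),
    "trace (0,0)": (0, 0),
}

counts = {name: dimension(e, f) for name, (e, f) in blocks.items()}
total = sum(counts.values())

for name, count in counts.items():
    print(f"{name}: {count}")
print("total:", total)
```

The total of 9 + 3 + 3 + 1 = 16 equals the 4 × 4 components of a general second-rank tensor in 3 + 1 dimensions.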

Another way to look at this is to compare the number of spatial components 
in fields with 2s + 1. For scalar fields (spin 0), 2s + 1 gives one component. A 
4-vector field has one scalar component and 2s +1 = 3 spatial components (spin 
1). A spin-2 field has nine spatial components: one scalar (spin-0) component, 
three vector (spin-1) components and 2s + 1 = 5 remaining spin-2 components. 
This is reflected in the way that the representations of the Lorentz transformation 
matrices reduce into diagonal blocks for spins 0, 1 and 2. See ref. [132] for a 
discussion of spin-2 fields and covariance. 

It is coincidental for 3 + 1 dimensions that spin-0 particles have no Lorentz
indices, spin-1 particles have one Lorentz index and spin-2 particles have two 
Lorentz indices. 

What is the physical meaning of the spin label? The spin is the highest weight 
of the representation which characterizes rotational invariance of the system. 
Since the string of values produced by the stepping operators moves in integer 
steps, it tells us how many distinct ways, m + m’, a system can spin in an 
‘equivalent’ fashion. In this case, equivalent means about the same axis. 


11.7.5 Helicity versus spin 
Helicity is defined by 


$$ \lambda \equiv J_i\,p^i. $$ (11.127)


Spin $s$ and helicity $\lambda$ are clearly related quite closely, but they are subtly
different. It is not uncommon to refer loosely to helicity as spin in the literature
since that is often the relevant quantity to consider. The differences in rotation
algebras, as applied to physical states, are summarized in table 11.3. Because
the value of the helicity is not determined by an upper limit on the total
angular momentum, it is conventional to use the component of the spin of the
irreducible representation for the Lorentz group which lies along the
direction of travel. Clearly these two definitions are not the same thing. In
the massless case, the labels for the helicity are the same as those which would
occur for $m_j$ in the rest frame of the massive case.

From eqn. (11.127) we see that the helicity is rotationally invariant for
massive fields and generally Lorentz-invariant for massless ($p^2 = 0$) fields.
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Table 11.3. Spin and helicity. 


Casimir A,=m; 


Massive j(j+1) 0,43,41,...,4/ 
Massless 0 0, +5, +1,..., 00 


It transforms like a pseudo-scalar, since J; is a pseudo-vector. Thus, the 
sign of helicity changes under parity transformations, and a massless particle 
which takes part in parity conserving interactions must have both helicity states 
A, i.e. we must represent it by a (reducible) symmetrized pair of irreducible 


representations: 
( 3 “ú ) or ( X H ). (11.128) 


The former is the case for the massless Dirac field ($\lambda = \pm\frac{1}{2}$), while the
latter is true for the photon field $F^{\mu\nu}$ ($\lambda = \pm 1$), where the states correspond
to left and right circularly polarized radiation. Note that, whereas a massive
particle could have $\lambda = 0, \pm 1$, representing left transverse, right transverse
and longitudinal angular momentum, a massless (purely transverse) field cannot
have a longitudinal mode, so $\lambda = 0$ is absent. This can be derived more
rigorously from representation theory.
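As a concrete check of eqn. (11.127) in the simplest case, the following Python sketch builds the helicity operator for a spin-1/2 representation, taking $J_i = \sigma_i/2$ with $\hbar = 1$. It confirms that the eigenvalues are $\pm\frac{1}{2}$ and that the operator changes sign when $p \rightarrow -p$ with $J$ held fixed, as happens under parity. The function names and the sample momentum are illustrative assumptions, not part of the text.

```python
import math

# Pauli matrices; the spin-1/2 generators are J_k = sigma_k / 2 (hbar = 1).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def helicity_operator(p):
    """Return the 2x2 matrix J . p_hat = (1/2) sigma . p_hat for momentum p."""
    norm = math.sqrt(sum(c * c for c in p))
    p_hat = [c / norm for c in p]
    return [[0.5 * sum(p_hat[k] * SIGMA[k][a][b] for k in range(3))
             for b in range(2)] for a in range(2)]

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = complex(m[0][0] + m[1][1]).real
    det = complex(m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return [(tr - disc) / 2.0, (tr + disc) / 2.0]

p = [1.0, 2.0, -0.5]                            # arbitrary illustrative momentum
h = helicity_operator(p)
h_parity = helicity_operator([-c for c in p])   # parity: p -> -p, J unchanged

helicities = eigenvalues_2x2(h)                 # the two helicity labels
flips = all(abs(h_parity[a][b] + h[a][b]) < 1e-12
            for a in range(2) for b in range(2))
print(helicities, flips)
```

Because the operator itself flips sign, a state of helicity $+\frac{1}{2}$ is mapped to one of helicity $-\frac{1}{2}$, which is why both labels are needed in a parity conserving theory.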

In refs. [45, 55], the authors study massless fields with general spin and show 
that higher spins do not necessarily have to be strictly conserved; only the Dirac- 
traceless part of the divergence has to vanish. 


11.7.6 Fractional spin in 2 + 1 dimensions 


The Poincaré group in 2 + 1 dimensions shares many features of the group 
in 3 + 1 dimensions, but also conceals many subtleties [9, 58, 77]. These 
have specific implications for angular momentum and spin. In two spatial 
dimensions, rotations form an Abelian group $SO(2) \sim U(1)$, whose generators
can, in principle, take on eigenvalues which are unrestricted by the constraints 
of spherical harmonics. This leads to continuous phases [89, 138], particle 
statistics and the concept of fractional spin. It turns out, however, that there is a 
close relationship between vector (gauge) fields and spin in 2 + 1 dimensions, 
and that fractional values of spin can only be realized in the context of a gauge 
field coupling. This is an involved topic, with a considerable literature, which 
we shall not delve into here. 
11.8 Work, force and transport in open systems


The notion of interaction and force in field theory is unlike the classical 
picture of particles bumping into one another and transferring momentum. Two 
fields interact in the manner of two waves passing through one another: by 
interference, or amplitude modulation. Two fields are said to interact if there is 
a term in the action in which some power of one field multiplies some power of 
another. For example, 


$$ S_{\rm int} = \int (dx)\, \phi^2 A_\mu A^\mu. \tag{11.129} $$


Since the fields multiply, they modulate one another’s behaviour or perturb 
one another. There is no explicit notion of a force here, and precisely what 
momentum is transferred is rather unclear in the classical picture; nevertheless, 
there is an interaction. This can lead to scattering of one field off another, for 
instance. 

The source terms in the previous section have the form of an interaction, 
in which the coupling is linear, and thus they exert what is referred to as a 
generalized force on the field concerned. The word generalized is used because 
J does not have the dimensions of force — what is important is that the source 
has an influence on the behaviour of the field. 

Moreover, if we place all such interaction terms on the right hand side of 
the equations of motion, it is clear that interactions also behave as sources for 
the fields (or currents, if you prefer that name). In eqn. (11.129), the coupling
between $\phi$ and $A_\mu$ will lead to a term in the equations of motion for $\phi$ and for
$A_\mu$, thus it acts as a source for both fields.

We can express this in other words: an interaction can be thought of as a 
source which transfers some ‘current’ from one field to another. But be wary 
that what we are calling heuristically ‘current’ might be different in each case 
and have different dimensions. 

A term in which a field multiplies itself, $\phi^n$, is called a self-interaction. In
this case the field is its own source. Self-interactions lead to the scattering of 
a field off itself. The classical notion of a force was described in terms of the 
energy-momentum tensor in section 11.3. 


11.8.1 The generalized force $F^\nu = \partial_\mu T^{\mu\nu}$


There is a simple proof which shows that the tensor $T_{\mu\nu}$ is conserved, provided
one has Lorentz invariance and the classical equations of motion are satisfied. 
Consider the total dynamical variation of the action 


$$ \delta S = \int \frac{\delta S}{\delta g^{\mu\nu}}\,\delta g^{\mu\nu} + \int \frac{\delta S}{\delta q}\,\delta q = 0. \tag{11.130} $$
Since the equations of motion are satisfied, the second term vanishes identically,
leaving

$$ \delta S = \frac{1}{2}\int (dx)\, T_{\mu\nu}\,\delta g^{\mu\nu}. \tag{11.131} $$


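For simplicity, we shall assume that the metric $g_{\mu\nu}$ is independent of $x$, so that
the variation may be written (see eqn. (4.88))

$$ \delta S = \int (dx)\, T^{\mu\nu}\left[ g_{\mu\lambda}(\partial_\nu \epsilon^\lambda) + g_{\lambda\nu}(\partial_\mu \epsilon^\lambda) \right] = 0. \tag{11.132} $$

Integrating by parts, we obtain

$$ \delta S = \int (dx) \left[ -2\,\partial_\mu T^{\mu\nu} \right] \epsilon_\nu = 0. \tag{11.133} $$

Since $\epsilon_\nu(x)$ is arbitrary, this implies that

$$ \partial_\mu T^{\mu\nu} = 0, \tag{11.134} $$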


and hence $T^{\mu\nu}$ is conserved. From this argument, it would seem that $T^{\mu\nu}$
must always be conserved in every physical system, and yet one could imagine
constructing a physical model in which energy was allowed to leak away. The
assumption of Lorentz invariance and the use of the equations of motion provide
a catch, however. While it is true that the energy-momentum tensor is conserved
in any complete physical system, it does not follow that energy or momentum 
is conserved in every part of a system individually. If we imagine taking 
two partial systems and coupling them together, then those two systems can 
exchange energy. In fact, energy will only be conserved if the systems are in 
perfect balance: if, on the other hand, one system does work on the other, then 
energy flows from one system to the other. No energy escapes the total system, 
however. 

Physical systems which are coupled to other systems, about which we have 
no knowledge, are called open systems. This is a matter of definition. Given 
any closed system, we can make an open system by isolating a piece of it and 
ignoring the rest. Clearly a description of a piece of a system is an incomplete 
description of the total system, so it appears that energy is not conserved in the 
small piece. In order to see conservation, we need to know about the whole 
system. This situation has a direct analogue in field theory. Systems are placed 
in contact with one another by interactions, often through currents or sources. 
For instance, Dirac matter and radiation couple through a term which looks like
$J^\mu A_\mu$. If we look at only the Dirac field, the energy-momentum tensor is not
conserved. If we look at only the radiation field, the energy-momentum tensor 
is not conserved, but the sum of the two parts is. The reason is that we have to 
be ‘on shell’ — i.e., we have to satisfy the equations of motion. 
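The exchange of energy between two coupled partial systems can be illustrated with a simple mechanical analogue. The Python sketch below is not a field theory calculation: it integrates two unit-mass harmonic oscillators coupled through a term $g\,x_1 x_2$ (the coupling strength, time step and initial data are arbitrary illustrative choices). The energy of each part on its own varies strongly, while the total, including the interaction energy, stays constant to within the integrator's accuracy.

```python
def simulate(steps=20000, dt=1e-3, k=1.0, g=0.2):
    """Two unit-mass oscillators with potential k*x1^2/2 + k*x2^2/2 + g*x1*x2."""
    x1, x2, v1, v2 = 1.0, 0.0, 0.0, 0.0   # all energy starts in subsystem 1

    def forces(a, b):
        return -k * a - g * b, -k * b - g * a

    f1, f2 = forces(x1, x2)
    e_part1, e_part2, e_total = [], [], []
    for _ in range(steps):
        # velocity Verlet step (symplectic, so the total energy stays bounded)
        v1 += 0.5 * dt * f1
        v2 += 0.5 * dt * f2
        x1 += dt * v1
        x2 += dt * v2
        f1, f2 = forces(x1, x2)
        v1 += 0.5 * dt * f1
        v2 += 0.5 * dt * f2
        e1 = 0.5 * v1 * v1 + 0.5 * k * x1 * x1
        e2 = 0.5 * v2 * v2 + 0.5 * k * x2 * x2
        e_part1.append(e1)
        e_part2.append(e2)
        e_total.append(e1 + e2 + g * x1 * x2)
    return e_part1, e_part2, e_total

def spread(values):
    """Difference between the largest and smallest value in a sequence."""
    return max(values) - min(values)

e1, e2, etot = simulate()
print(spread(e1), spread(e2), spread(etot))
```

Looking at either oscillator alone, one would conclude that energy is not conserved; only the sum of the parts, evaluated on the solution of the full equations of motion, is constant.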
Consider the following example. The (incomplete) action for the interaction
between the Dirac field and the Maxwell field is

$$ S = \int (dx) \left\{ \frac{1}{4\mu_0} F^{\mu\nu}F_{\mu\nu} - J^\mu A_\mu \right\}, \tag{11.135} $$

where $J^\mu = \bar\psi\gamma^\mu\psi$. Now, computing the energy-momentum tensor for this
action, we obtain

$$ \partial_\mu T^{\mu\nu} = F^{\mu\nu} J_\mu. \tag{11.136} $$


This is not zero because we are assuming that the current $J^\mu$ is not zero. But
this is not a consistent assumption in the action, because we have not added
any dynamics for the Dirac field, only the coupling $J^\mu A_\mu$. Consider the field
equation for $\psi$ from eqn. (11.135). Varying with respect to $\bar\psi$,

$$ \frac{\delta S}{\delta\bar\psi} = -\gamma^\mu A_\mu \psi = 0. \tag{11.137} $$

This means that either $A_\mu = 0$ or $\psi = 0$, but both of these assumptions make
the right hand side of eqn. (11.136) zero! So, in fact, the energy-momentum
tensor is conserved, as long as we obey the equations of motion given by the
variation of the action.

The ‘paradox’ here is that we did not include a piece in the action for the 
Dirac field, but that we were sort of just assuming that it was there. This is a 
classic example of writing down an incomplete (open) system. The full action, 


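$$ S = \int (dx) \left\{ \frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} - J^\mu A_\mu + \bar\psi(\gamma^\mu\partial_\mu + m)\psi \right\}, \tag{11.138} $$

has a conserved energy-momentum tensor, for more interesting solutions than
$\psi = 0$.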

From this discussion, we can imagine the imbalance of energy-momentum 
on a partial system as resulting in an external force on this system, just as in 
Newton’s second law. Suppose we define the generalized external force by 


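$$ F^\nu = \int d\sigma\, \partial_\mu T^{\mu\nu}. \tag{11.139} $$

The spatial components are

$$ F^i = \int d\sigma\, \partial_0 T^{0i} = \partial_0 P^i = \frac{dp^i}{dt}, \tag{11.140} $$

which is just Newton’s second law. Compare the above discussion with
eqn. (2.73) for the Poynting vector.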

An important lesson to learn from this is that a source is not only a generator 
for the field (see section 14.2) but also a model for what we do not know about 
an external system. This is part of the essence of source theory as proposed by 
Schwinger. For another manifestation of this, see section 11.3.3. 
11.8.2 Work and power


In chapter 5 we related the imaginary part of the Feynman Green function to 
the instantaneous rate at which work is done by the field. We now return to this 
problem and use the energy-momentum tensor to provide a new perspective on 
the problem. 

In section 6.1.4 we assumed that the variation of the action with time, 
evaluated at the equations of motion, was the energy of the system. It is now 
possible to justify this; in fact, it should already be clear from eqn. (11.78). We 
can go one step further, however, and relate the power loss to the notion of an 
open system. If a system is open (if it is coupled to sources), it does work, w. 
The rate at which it does work is given by

$$ \frac{dw}{dt} = \int d\sigma\, \partial_\mu T^{\mu 0}. \tag{11.141} $$

This has the dimensions of energy per unit time. It is clearly related to the
variation of the action itself, evaluated at a value of the field which satisfies the
field equations, since

$$ \Delta w = \int d\sigma\,dt\; \partial_\mu T^{\mu 0} = \left.\frac{\delta S}{\delta t}\right|_{\rm field\ eqns}. \tag{11.142} $$


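The electromagnetic field is the prototypical example here. If we consider the
open part of the action (the source coupling),

$$ S_J = \int (dx)\, J^\mu A_\mu, \tag{11.143} $$

then, using

$$ A_\mu(x) = \int (dx')\, D_{\mu\nu}(x,x')\, J^\nu(x'), \tag{11.144} $$

we have

$$ \delta S[A_J] = \int (dx)\, J^\mu\,\delta A_\mu
 = \int (dx)(dx')\, J^\mu(x)\,\delta D_{\mu\nu}(x,x')\, J^\nu(x')
 = \int (dx)\,(\partial_\mu T^{\mu 0})\,\delta t
 = \Delta w\,\delta t. \tag{11.145} $$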


The Green function we choose here plays an important role in the discussion, 
as noted in section 6.1.4. There are two Green functions which can be used 
in eqn. (11.144) as the inverse of the Maxwell operator: the retarded Green
function and the Feynman Green function. The key expression here is

$$ W = \int (dx)(dx')\, J^\mu(x)\, D_{\mu\nu}(x,x')\, J^\nu(x'). \tag{11.146} $$

Since the integral is spacetime symmetrical, only the symmetrical part of the
Green function contributes to the integral. This immediately excludes the
retarded Green function.
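The statement that only the symmetrical part of the kernel contributes can be seen in a finite-dimensional analogue, where the double integral becomes a double sum $\sum_{ij} J_i D_{ij} J_j$. The Python sketch below uses a randomly generated matrix as a stand-in for a discretized Green function (a purely illustrative assumption), splits it into symmetric and antisymmetric parts, and evaluates the quadratic form for each.

```python
import random

random.seed(0)
n = 6
# Hypothetical discretized kernel D[i][j] (no symmetry imposed) and source J[i].
D = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
J = [random.uniform(-1, 1) for _ in range(n)]

def quadratic_form(kernel, src):
    """Discrete analogue of W: the double sum of J_i D_ij J_j."""
    return sum(src[i] * kernel[i][j] * src[j]
               for i in range(len(src)) for j in range(len(src)))

D_sym = [[0.5 * (D[i][j] + D[j][i]) for j in range(n)] for i in range(n)]
D_anti = [[0.5 * (D[i][j] - D[j][i]) for j in range(n)] for i in range(n)]

W_full = quadratic_form(D, J)
W_sym = quadratic_form(D_sym, J)
W_anti = quadratic_form(D_anti, J)   # vanishes up to rounding
print(W_full, W_sym, W_anti)
```

The antisymmetric part drops out identically because $J_i J_j$ is symmetric under the exchange of $i$ and $j$.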


11.8.3 Hydrodynamic flow and entropy 


Hydrodynamics is not usually regarded as field theory, but it is from hydro- 
dynamics (fluid mechanics) that we derive notions of macroscopic transport. 
All transport phenomena and thermodynamic properties are based on the idea 
of flow. The equations of hydrodynamics are the Navier-Stokes equations. 
These are non-linear vector equations with highly complex properties, and their 
complete treatment is outside the scope of this book. In their linearized form, 
however, they may be solved in the usual way of a classical field theory, using 
the methods of this book. We study hydrodynamics here in order to forge a 
link between field theory and thermodynamics. This is an important connection, 
which is crying out to be a part of the treatment of the energy-momentum tensor. 
We should be clear, however, that this is a phenomenological addition to the field 
theory for statistically large systems. 

A fluid is represented as a velocity field, $U^\mu(x)$, such that each point in a
system is moving with a specified velocity. The considerations in this section do
not depend on the specific nature of the field, only that the field is composed of
matter which is flowing with the velocity vector $U^\mu$. Our discussion of flow will
be partly inspired by the treatment in ref. [134], and it applies even to relativistic
flows. As we shall see, the result differs from the non-relativistic case only by a
single term. A stationary field (fluid) with maximal spherical symmetry, in flat
spacetime, has an energy-momentum tensor given by

$$ T_{00} = H, \qquad T_{0i} = T_{i0} = 0, \qquad T_{ij} = P\,\delta_{ij}. \tag{11.147} $$


In order to make this system flow, we may perform a position-dependent boost 
which places the observer in relative motion with the fluid. Following a boost, 
the energy-momentum tensor has the form 


$$ T^{\mu\nu} = P g^{\mu\nu} + (P + H)U^\mu U^\nu/c^2. \tag{11.148} $$
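For a uniform boost, the form in eqn. (11.148) can be checked numerically. The Python sketch below is a minimal illustration under stated assumptions: the metric is taken as $g = {\rm diag}(-1, 1, 1, 1)$, the boost is along $x$ and independent of position (the text allows a position-dependent boost), and the values of $H$, $P$ and $\beta = v/c$ are arbitrary. It transforms the rest-frame tensor ${\rm diag}(H, P, P, P)$ with two boost matrices and compares the result with the closed formula.

```python
import math

C = 1.0  # speed of light in illustrative units (assumption)

def boost_x(beta):
    """Boost matrix along x acting on contravariant components (ct, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, gamma * beta, 0.0, 0.0],
            [gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boosted_tensor(H, P, beta):
    """Transform the rest-frame tensor diag(H, P, P, P) with two boost matrices."""
    rest = [[0.0] * 4 for _ in range(4)]
    rest[0][0] = H
    for i in range(1, 4):
        rest[i][i] = P
    L = boost_x(beta)
    return [[sum(L[m][a] * L[n][b] * rest[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def fluid_formula(H, P, beta):
    """Evaluate P g^{mu nu} + (P + H) U^mu U^nu / c^2 with g = diag(-1, 1, 1, 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    U = [gamma * C, gamma * beta * C, 0.0, 0.0]
    g = [[(-1.0 if m == 0 else 1.0) if m == n else 0.0 for n in range(4)]
         for m in range(4)]
    return [[P * g[m][n] + (P + H) * U[m] * U[n] / C**2 for n in range(4)]
            for m in range(4)]

H, P, beta = 3.0, 1.0, 0.6
max_diff = max(abs(a - b)
               for row_a, row_b in zip(boosted_tensor(H, P, beta),
                                       fluid_formula(H, P, beta))
               for a, b in zip(row_a, row_b))
print(max_diff)
```

The two constructions agree to rounding error, which is the content of the boost argument for this special case.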


The terms have the dimensions of energy density. $P$ is the pressure exerted by
the fluid (clearly a thermodynamical average variable, which summarizes the
microscopic thermal motion of the field). $H$ is the internal energy density of the
field. Let us consider the generalized thermodynamic force $F^\mu = \partial_\nu T^{\mu\nu}$. In a
closed thermodynamic system, we know that the energy-momentum tensor is
conserved:

$$ F^\mu = \partial_\nu T^{\mu\nu} = 0, \tag{11.149} $$

and that the matter density $N(x)$ in the field is conserved,

$$ \partial_\mu N^\mu = 0, \tag{11.150} $$


where $N_\mu = N(x)U_\mu$. If we think of the field as a plasma of particles, then
$N(x)$ is the number of particles per unit volume, or number density. Due to its
special form, we may write

$$ \partial_\mu N^\mu = (\partial_\mu N)U^\mu + N(\partial_\mu U^\mu), \tag{11.151} $$

which provides a hint that the velocity boost acts like a local scaling or
conformal transformation on space

$$ -c^2dt^2 + dx_i dx^i \rightarrow -c^2 dt^2 + \Omega(U)\,dx_i dx^i. \tag{11.152} $$


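The average rate of work done by the field is zero in an ideal, closed system:

$$ \frac{dw}{dt} = \int d\sigma\, U_\mu F^\mu = \int d\sigma \left[ U^\mu\partial_\mu P - \partial_\mu\big((P+H)U^\mu\big) \right] = 0. \tag{11.153} $$

Now, noting the identity

$$ N\partial_\mu\left(\frac{P+H}{N}\right) = \partial_\mu(P+H) - \left(\frac{\partial_\mu N}{N}\right)(P+H), \tag{11.154} $$

we may write

$$ \frac{dw}{dt} = \int d\sigma\, U^\mu\left[ \partial_\mu P - N\partial_\mu\left(\frac{P+H}{N}\right) \right]. \tag{11.155} $$

Then, integrating by parts, assuming that $U^\mu$ is zero on the boundary of the
system, and using the identity in eqn. (11.151),

$$ \frac{dw}{dt} = -\int d\sigma\, N U^\mu \left[ P\,\partial_\mu V + \partial_\mu H \right], \tag{11.156} $$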
where V is the volume per particle and H is the internal energy. This expression
can be compared with 


$$ T\,dS = P\,dV + dH. \tag{11.157} $$
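The algebra connecting the integrands of eqn. (11.155) and eqn. (11.156) is compressed in the text. A sketch of the missing step follows, under two assumptions that are not stated explicitly above: that $V = 1/N$ is the volume per particle, and that the $H$ appearing in eqn. (11.156) is the internal energy per particle, written here as $h = H/N$ (the symbol $h$ is introduced only for this illustration).

```latex
\begin{align*}
N\,\partial_\mu\!\left(\frac{P+H}{N}\right) - \partial_\mu P
  &= N\,\partial_\mu (P V) + N\,\partial_\mu h - \partial_\mu P \\
  &= N P\,\partial_\mu V + N V\,\partial_\mu P + N\,\partial_\mu h - \partial_\mu P \\
  &= N\left[\, P\,\partial_\mu V + \partial_\mu h \,\right],
  \qquad \text{since } N V = 1 .
\end{align*}
```

Multiplying by $-U^\mu$ and integrating over $\sigma$ reproduces the form of eqn. (11.156), which is what makes the comparison with $T\,dS = P\,dV + dH$ natural.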


Eqn. (11.156) may be interpreted as a rate of entropy production due to the 
hydrodynamic flow of the field, i.e. it is the rate at which energy becomes 
unavailable to do work, as a result of energy diffusing out over the system 
uniformly or as a result of internal losses. We are presently assuming this to 
be zero, in virtue of the conservation law, but this can change if the system 
contains hidden degrees of freedom (sources/sinks), such as friction or viscosity, 
which convert mechanical energy into heat in a non-useful form. Combining 
eqn. (11.156) and eqn. (11.157) we have 


$$ -\int d\sigma\, N T\, U^\mu \partial_\mu S = \int d\sigma\, U_\mu \partial_\nu T^{\mu\nu} = 0. \tag{11.158} $$


From this, it is useful to define a covariant entropy density vector $S^\mu$, which
symbolizes the rate of loss of energy in the hydrodynamic flow. In order to
express the right hand side of eqn. (11.158) in terms of gradients of the field and
the temperature, we integrate by parts and define an entropy current. Let


U, 
c(3 S") = d, (=) T”, (11.159) 
where 


S! = NSU" — La 


(11.160) 


The zeroth component, $cS^0 = NS$, is the entropy density, so we may interpret
$S^\mu$ as a spacetime entropy vector. Let us now assume that hidden losses can
cause the conservation law to be violated. Then we have the rate of entropy 
generation given by 


c(0,S") = [-F@u) + FOTW, | T”, (11.161) 


We shall assume that the temperature is independent of time, since the simple 
arguments used to address statistical issues at the classical level do not take into 
account time-dependent changes properly: the fluctuation model introduced in 
section 6.1.5 gives rise only to instantaneous changes or steady state flows. If 
we return to the co-moving frame in which the fluid is stationary, we have 


U; = ð U?’ = 3T = 0, (11.162) 
and thus
u 1 1 6: 
(0,5") = — zzp (Ud) + 72 OF) T 
1 i 
+ ap oui +ə;UpDT”. (11.163) 


Note the single term which vanishes in the non-relativistic limit $c \rightarrow \infty$. This is
the only sign of the Lorentz covariance of our formulation. Also, we have used
the symmetry of $T^{ij}$ to write $\partial_i U_j$ in an $ij$-symmetric form.

So far, these equations admit no losses: the conservation law cannot be
violated, so energy cannot be dissipated. To introduce, phenomenologically,
an expression of dissipation, we need so-called constitutive relations which
represent average ‘frictional forces’ in the system. These relations provide a
linear relationship between gradients of the field and temperature and the rate of
entropy generation, or energy stirring. The following well known forms are used
in elementary thermodynamics to define the thermal conductivity $\kappa$ in terms of
the heat flux $Q_i$ and the temperature gradient; similarly the viscosity $\eta$ in terms
of the pressure $P$:


$$ Q_i = -\kappa\,\frac{dT}{dx^i}, \qquad P_{ij} = -\eta\,\frac{\partial U_i}{\partial x^j}. \tag{11.164} $$


The relations we choose to implement these must make the rate of entropy
generation non-negative if they are to make thermodynamical sense. It may
be checked that the following definitions fulfil this requirement in $n$ spatial
dimensions:

$$ T^{0i} = -\kappa\left(\partial_i T + T\,\partial_t U_i/c^2\right) $$
$$ T^{ij} = -\eta\left(\partial_i U_j + \partial_j U_i - \frac{2}{n}(\partial_k U^k)\,\delta_{ij}\right) - \zeta\,(\partial_k U^k)\,\delta_{ij}, \tag{11.165} $$

where $\kappa$ is the thermal conductivity, $\eta$ is the shear viscosity and $\zeta$ is the bulk
viscosity. The first term in this last equation may be compared with eqn. (9.217).
This makes use of the definition of shear $\sigma_{ij}$ for a vector field $V_i$ as a conformal
deformation

$$ \sigma_{ij} = \partial_i V_j + \partial_j V_i - \frac{2}{n}(\partial_k V^k)\,\delta_{ij}. \tag{11.166} $$


This is a measure of the non-invariance of the system to conformal, or shearing 
transformations. Substituting these constitutive equations into eqn. (11.163), 
one obtains


8,8" = Fs (AT + TIU; NOT + TU! /c) 
at 
2T 


4 \1 a 
= Sars yU y. (11.167) 


(a;U; + 3;U;) (a'U! + 3t!) 


11.8.4 Thermodynamical energy conservation 


The thermodynamical energy equations supplement the conservation laws for 
mechanical energy, but they are of a different character. These energy equations 
are average properties for bulk materials. They summarize collective micro- 
scopic conservation on a macroscopic scale. 


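$$ \partial_\mu T^{\mu\nu} = H + T\,dS + P\,dV + dF \tag{11.168} $$
$$ S = k\ln\Omega \tag{11.169} $$
$$ T\,dS = kT\,\frac{d\Omega}{\Omega} = \frac{1}{\beta}\frac{d\Omega}{\Omega}. \tag{11.170} $$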


11.8.5 Kubo formulae for transport coefficients 


In section 6.1.6, a general scheme for computing transport coefficients was 
presented, but only the conductivity tensor was given as an example. Armed 
with a knowledge of the energy-momentum tensor, entropy and the dissipative 
processes leading to viscosity, we are now in a position to catalogue the most 
important expressions for these transport coefficients. The construction of the 
coefficients is based on the general scheme outlined in section 6.1.6. In order to 
compute these coefficients, we make use of the assumption of linear dissipation, 
which means that we consider only first-order gradients of thermodynamic 
averages. This assumes a slow rate of dissipation, or a linear relation of the 
form 


$$ \langle{\rm variable}\rangle = k\,\nabla_\mu \langle{\rm source}\rangle, \tag{11.171} $$


where $\nabla_\mu$ represents some spacetime gradient. This is the so-called constitutive
relation. The expectation values of the variables may be derived from the 
generating functional W in eqn. (6.7) by adding source terms, or variables 
conjugate to the ones we wish to find correlations between. The precise meaning 
of the sources is not important in the linear theory we are using, since the source 
Table 11.4. Conductivity tensor.

Component            Response           Measure
$\sigma_{00}/c^2$    induced density    charge compressibility
$\sigma_{0i}/c$      density current    -
$\sigma_{ii}$        induced current    linear conductivity
$\sigma_{ij}$        induced current    transverse (Hall) conductivity


cancels out of the transport formulae completely (see eqn. (6.66)). Also, there is 
a symmetry between the variables and their conjugates. If we add source terms 


$$ S \rightarrow S + \int (dx)\left( J\cdot A + J^\mu A_\mu + J^{\mu\nu}A_{\mu\nu} \right), \tag{11.172} $$


then the J’s are sources for the A’s, but conversely the A’s are also sources for 
the J’s. 

We begin therefore by looking at the constitutive relations for the transport 
coefficients, in turn. The generalization of the conductivity derived in eqn. (6.75) 
for the spacetime current is 


$$ J_\mu = \sigma_{\mu\nu} A^\nu. \tag{11.173} $$


Although derivable directly from Ohm’s law, this expresses a general dissipative
relationship between any current $J^\mu$ and source $A^\mu$, so we would expect this
construction to work equally well for any kind of current, be it charged or not. 
From eqn. (11.171) and eqn. (6.66) we have the Fourier space expression for the 
spacetime conductivity tensor in terms of the Feynman correlation functions 


Two) = lim + J (dxje*O-) O A), (11.174) 


or in terms of the retarded functions. In general the products with Feynman 
boundary conditions are often easier to calculate, since there are theorems for 
their factorization. 


>0 


singe a- erate) —ik(x—x’) , 
Ou (@) ye lim fa (dx )e (Ju(x)Jv(@)). (11.175) 


The D.C. conductivity is given by the $\omega \rightarrow 0$ limit of this expression. The
components of this tensor are shown in table 11.4. The constitutive relations for
the viscosities are given in eqn. (11.165). From eqn. (6.67) we have

$$ \langle T_{\mu\nu}(x)\rangle = \frac{\delta W}{\delta J^{\mu\nu}(x)} \tag{11.176} $$

and


WOO i 
EJP (x!) = h (Tow &)Too (x )) 


= (Tyo Taa), (11.177) 
where the last line is a consequence of the connectivity of Feynman averaging.
Note that this relation does not depend on our ability to express $W[J^{\mu\nu}]$ in a
quadratic form analogous to eqn. (6.35). The product on the right hand side
can be evaluated by expressing $T_{\mu\nu}$ in terms of the field. The symmetry of the
energy-momentum tensor implies that

$$ J^{\mu\nu} = J^{\nu\mu}, \tag{11.178} $$

and, if the source coupling is to have dimensions of action, $J_{\mu\nu}$ must be
dimensionless. The only object one can construct is therefore

$$ J^{\mu\nu} = g^{\mu\nu}. \tag{11.179} $$


Thus, the source term is the trace of the energy-momentum tensor, which 
vanishes when the action is conformally invariant. To express eqn. (11.165) 
in momentum space, we note that Fourier transform of the velocity is the phase 
velocity of the waves, 


UG) dx! d”k ikir, © 
== — e — 
=a Ga E 
d'k pu, œk' 
= a e 11.180 
Ory KE ( ) 
The derivative is given by 
; d'k an, œk'k; 
ajU' =i e EI 11.181 
j iv f Gare 2 ( ) 
Thus, eqn. (11.165) becomes 
4 \., ny, kikjį 
(Tj) =—-(o+ =n | tvesiy — niyo- (11.182) 


Comparing this with eqn. (11.176), we have, for the spatial components, 


4 iKj 
z ¢ T na 8ij8lm — 14 lm = 


1 —ik(x—x’) i 1 
ho (dx) e (Ti (x)T, @ )). (11.183) 


Contracting both sides with g‘/g’” leaves 
4-n 
¿ (w) + TA nw) } = 


lim 
k>0 n? 


: f antemo (x’)). (11.184) 
ha ` 


The two viscosities cannot be separated in this relation, but $\eta$ can be related to
the diffusion coefficient, which can be calculated separately. Assuming a causal
(retarded) relation between field and source, at finite temperature we may use
eqn. (6.74) to write


4-—n 
(co — no) | = 
n B 
— eho f - 
lim e [aotroa (11.185) 
n-hw 


The temperature conduction coefficient $\kappa$ is obtained from eqn. (11.165).
Following the same procedure as before, we obtain 


: i Coa a 


= =g K (ƏT + T,U'/c*)) 
= ~ig k (kT — Tyo k’ /k’). (11.186) 


Rearranging, we get 


= gojki(1 — e”) ik" (x—x!)y POI Ojea 
G ie e ETATE 
(11.187) 


To summarize, we note a list of properties with their relevant fluctuations and 
conjugate sources. See table 11.5. 


11.9 Example: Radiation pressure 


The fact that the radiation field carries momentum means that light striking a 
material surface will exert a pressure equal to the change in momentum of the 
light there. For a perfectly absorptive surface, the pressure will simply be equal
to the momentum striking the surface. At a perfectly reflective (elastic) surface, 
the change in momentum is twice the momentum of the incident radiation in 
that the light undergoes a complete change of direction. Standard expressions 
for the radiation pressure are for reflective surfaces. 
Table 11.5. Fluctuation generators.

Property                     Fluctuation    Source
Electromagnetic radiation    $A_\mu$        $J^\mu$
Electric current             $J_i$          $A^i$
Compressibility              $N_0$          $A^0$
Temperature current          $T_{0i}$       $T^{0i}$ (heat $Q$)


The pressure (kinetic energy density) in a relativistic field is thus

$$ P_i = -2\,T_{0i} = p_i c/\sigma \tag{11.188} $$


with the factor of two coming from a total reversal in momentum, and $\sigma$ being
the volume of the uniform system outside the surface. Using the arguments of
kinetic theory, where the kinetic energy density of a gas with average velocity
$\langle v\rangle$ is isotropic in all directions,

$$ \frac{1}{2}m\langle v^2\rangle = \frac{1}{2}m(v_x^2 + v_y^2 + v_z^2) \sim \frac{3}{2}m v_x^2, \tag{11.189} $$

we write

$$ P_i \sim \frac{1}{3}\langle P\rangle. \tag{11.190} $$

Thus, the pressure of diffuse radiation on a planar reflective surface is

$$ P_i = -\frac{2}{3}\,T_{0i}. \tag{11.191} $$


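Using eqn. (7.88), we may evaluate this, giving:

$$ P_i = \frac{2}{3}\,\frac{(\mathbf{E}\times\mathbf{H})_i}{c} = \frac{2}{3}\,\epsilon_0 c\,(\mathbf{E}\times\mathbf{B})_i. \tag{11.192} $$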
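The isotropy argument behind eqns. (11.189) and (11.190) can be checked with a small Monte Carlo estimate. The Python sketch below draws velocities with independent Gaussian components, which is one convenient isotropic distribution chosen purely for illustration (the sample size and seed are likewise arbitrary), and measures the fraction of $\langle v^2\rangle$ carried by a single direction.

```python
import random

random.seed(1)
samples = 200000
sum_vx2 = 0.0
sum_v2 = 0.0
for _ in range(samples):
    # Isotropic velocities: independent Gaussian components (illustrative choice)
    vx, vy, vz = (random.gauss(0.0, 1.0) for _ in range(3))
    sum_vx2 += vx * vx
    sum_v2 += vx * vx + vy * vy + vz * vz

ratio = sum_vx2 / sum_v2   # fraction of <v^2> carried by one direction
print(ratio)
```

The estimated ratio lies close to one third, which is the origin of the factor of $\frac{1}{3}$ that converts the directed pressure into the pressure of diffuse radiation.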
Exercises 


Although this is not primarily a study book, it is helpful to phrase a few 
outstanding points as problems, to be demonstrated by the reader. 


(1) In the action in eqn. (11.61), add a kinetic term for the potential $V(x)$

$$ \Delta S = \int (dx)\, \frac{1}{2}(\partial^\mu V)(\partial_\mu V). \tag{11.193} $$


Vary the total action with respect to $\phi(x)$ to obtain the equation of motion
for the field. Then vary the action with respect to $V(x)$ and show that this
leads to the equations

$$ -\Box\phi + (m^2 + V)\phi = 0 $$
$$ -\Box V + \frac{1}{2}\phi^2 = 0. $$


Next show that the addition of this extra field leads to an extra term in the 
energy-momentum tensor, so that 


1 1 1 
Ou = 5 Bu) vb) + 5 OVV) — 500? + VG". (11.194) 
Using the two equations of motion derived above, show that

$$ \partial^\mu \theta_{\mu\nu} = 0 \tag{11.195} $$


so that energy conservation is now restored. This problem demonstrates 
that energy conservation can always be restored if one considers all of the 
dynamical pieces in a physical system. It also serves as a reminder that 
fixed potentials such as V (x) are only a convenient approximation to real 
physics. 


(2) Using the explicit form of a Lorentz boost transformation, show that a
fluid velocity field has an energy-momentum tensor of the form,

$$ T^{\mu\nu} = P g^{\mu\nu} + (P + H)U^\mu U^\nu/c^2. \tag{11.196} $$

Start with the following expressions for a spherically symmetrical fluid at
rest:

$$ T_{00} = H, \qquad T_{0i} = T_{i0} = 0, \qquad T_{ij} = P\,\delta_{ij}. \tag{11.197} $$


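(3) Consider a matter current $N^\mu = (N, N\mathbf{v}) = N(x)U^\mu(x)$. Show that the
conservation equation $\partial_\mu N^\mu = 0$ may be written

$$ \partial_\mu N^\mu = \left[\partial_t + \mathcal{L}_U\right] N, \tag{11.198} $$

where $\mathcal{L}_U = U^i\partial_i + (\partial_i U^i)$. This is called the Lie derivative. Compare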
this with the derivatives found in section 10.3 and the discussion found in 
section 9.6. See also ref. [111] for more details of this interpretation. 
(4) By writing the orbital angular momentum operator in the form
$L_i = \epsilon_{ijk}x^j p^k$ and the quantum mechanical commutation relations
$[x^i, p^j] = i\hbar\,\delta^{ij}$ in the form $\epsilon_{ijk}x^j p^k = i\hbar$, show that

$$ L_i\,\epsilon_{ilm} = [x_l, p_m] = i\hbar\,\delta_{lm}, \tag{11.199} $$

and thence

$$ \epsilon_{ilm} L_i L_l = i\hbar L_m. \tag{11.200} $$

Hence show that the angular momentum components satisfy the algebra
relation

$$ [L_i, L_j] = i\hbar\,\epsilon_{ijk}L_k. \tag{11.201} $$

Show that this is the Lie algebra for $so(3)$ and determine the dimensionless
generators $T^a$ and structure constants $f_{abc}$ in terms of $L_i$ and $\hbar$.
Charge and current


The idea of charge intuitively relates to that of fields and forces. Charge is 
that quality or attribute of matter which determines how it will respond to a 
particular kind of force. It is thus a label which distinguishes forces from one 
another. The familiar charges are: electric charge, which occurs in Maxwell’s 
equations; the mass, which occurs both in the laws of gravitation and inertia; the 
colour charge, which attaches to the strong force; and a variety of other labels, 
such as strangeness, charm, intrinsic spin, chirality, and so on. These attributes 
are referred to collectively as ‘quantum numbers’, though a better name might 
be ‘group numbers’. 

Charge plays the role of a quantity conjugate to the forces which it labels. 
Like all variables which are conjugate to a parameter (energy, momentum etc.) 
charge is a book-keeping parameter which keeps track of a closure or conserva- 
tion principle. It is a currency for the property it represents. This indicates that 
the existence of charge ought to be related to a symmetry or conservation law, 
and indeed this turns out to be the case. An important application of symmetry 
transformations is the identification of conserved ‘charges’, and vice versa. 


12.1 Conserved current and Noether’s theorem 


As seen in section 11.3, the spacetime variation of the action reveals a structure 
which leads to conservation equations in a closed system. The conservation 
equations have the generic form 


$$ \partial_t\rho + \vec\nabla\cdot\mathbf{J} = \partial_\mu J^\mu = 0, \tag{12.1} $$


for some ‘current’ $J^\mu$. These are continuity conditions, which follow from the
action principle (section 2.2.1). One can derive several different, but equally 
valid, continuity equations from the action principle by varying the action with 
respect to appropriate parameters. This is the essence of what is known as 
Noether’s theorem. 
In practice, one identifies the conservation law $\partial_\mu J^\mu = 0$ for current $J^\mu$, by
varying the action with respect to a parameter, conjugate to its charge. This
leads to two terms upon integration by parts: a main term, which vanishes (either
with the help of the field equations, or by straightforward cancellation), and a
surface term, which must vanish independently for stationary action $\delta S = 0$.
The surface term can be written in the form

$$ \delta S = \int d\sigma^\mu\, J_\mu\,\delta\lambda = 0 \tag{12.2} $$

for some $J_\mu$; then we say that we have discovered a conservation law for the
current $J_\mu$ and parameter $\lambda$.

This is most easily illustrated with the aid of examples. As a first example, we 
shall use this method to prove that the electric current is conserved for a scalar 
field. We shall set $c = \hbar = 1$ for simplicity here. The gauged action for a
complex scalar field is

$$ S = \int (dx) \left\{ \hbar^2c^2 (D^\mu\phi)^*(D_\mu\phi) + m^2c^4\,\phi^*\phi \right\}. \tag{12.3} $$


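Consider now a gauge transformation in which $\phi \rightarrow e^{ies}\phi$, and vary the action
with respect to $\delta s(x)$:

$$ \delta S = \int (dx)\,\hbar^2c^2 \left\{ (D^\mu(-ie\,\delta s)e^{ies}\phi)^*(D_\mu e^{ies}\phi) + (D^\mu e^{ies}\phi)^*(D_\mu (ie\,\delta s) e^{ies}\phi) \right\}. \tag{12.4} $$

Now using the property (10.41) of the gauge-covariant derivative that the phase
commutes through it, we have

$$ \delta S = \int (dx) \left\{ (D^\mu(-ie\,\delta s)\phi)^*(D_\mu\phi) + (D^\mu\phi)^*(D_\mu (ie\,\delta s)\phi) \right\}. \tag{12.5} $$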
We now integrate by parts to remove the derivative from $\delta s$ and use the equations
of motion $(-D^2 + m^2)\phi = 0$ and $(-D^{*2} + m^2)\phi^* = 0$, which leaves only the
surface (total derivative) term

$$ \delta S = \int d\sigma^\mu\, \delta s\, J_\mu, \tag{12.6} $$

where

$$ J^\mu = ie\hbar^2c^2\left(\phi^*(D^\mu\phi) - (D^\mu\phi)^*\phi\right). \tag{12.7} $$


Eqn. (12.2) can be written

$$ \frac{1}{c}\int d\sigma^\mu\, J_\mu = {\rm const}. \tag{12.8} $$

In other words, this quantity is a constant of the motion. Choosing the canonical
spacelike hyper-surface for $\sigma$, eqn. (12.8) has the interpretation

$$ \frac{1}{c}\int d\sigma^0 J_0 = \int d\sigma\,\rho = Q, \tag{12.9} $$

where $\rho$ is the charge density and $Q$ is therefore the total charge. In other words,
Noether’s theorem tells us that the total charge is conserved by the dynamical
evolution of the field.

As a second example, let us consider dynamical variations of the field $\delta\phi$.
Anticipating the discussion of the energy-momentum tensor, we can write
eqn. (11.43) in the form

$$ \delta S = \int d\sigma^\mu\, J_\mu = 0, \tag{12.10} $$

where we have defined the ‘current’ as

$$ J_\mu\,\delta\lambda \sim \Pi_\mu\,\delta q - \theta_{\mu\nu}\,\delta x^\nu. \tag{12.11} $$

This is composed of a piece expressed in terms of the canonical field variables,
implying that canonical momentum is conserved for field dynamics,

$$ \partial^\mu \Pi_\mu = 0, \tag{12.12} $$

and another piece for the mechanical energy-momentum tensor, for which the
parameter is the spacetime displacement $\delta x^\nu$. This argument is usually used to
infer that the canonical momentum and the energy-momentum tensor,

$$ \partial^\mu \theta_{\mu\nu} = 0, \tag{12.13} $$

are conserved; i.e. the conservation of mechanical energy and momentum.

If the action is complete, each variation of the action leads to a form which 
can be interpreted as a conservation law. If the action is incomplete, so that 
conservation cannot be maintained with the number of degrees of freedom 
given, then this equation appears as a constraint which restricts the system. In a 
conservative system, the meaning of this equation is that ‘what goes in goes out’
of any region of space. Put another way, in a conservative system, the essence 
of the field cannot simply disappear, it must move around by flowing from one 
place to another. 

Given a conservation law, we can interpret it as a law of conservation of an 
abstract charge. Integrating the conservation law over spacetime, 


$$ \int (dx)\, \partial^\mu J_\mu = \int d\sigma^\mu J_\mu = \text{const.} \tag{12.14} $$
Table 12.1. Conjugates and generators.


Symmetry            Q (generator)    v (conjugate variable)
Translation         p_i              x^i
Time development    -H               t
Electric phase      e                \theta = \int A_\mu dx^\mu
Non-Abelian phase   g T^a            \theta^a = \int A^a_\mu dx^\mu


If we choose $d\sigma^\mu$, i.e. $\mu = 0$, to be a spacelike hyper-surface (i.e. a surface of
covariantly constant time), then this defines the total charge of the system:

$$ Q(t) = \int d\sigma\, \rho(\mathbf{x}, t) = \int d^n\mathbf{x}\, \rho(\mathbf{x}, t). \tag{12.15} $$


Combining eqns. (12.14) and (12.15), we can write 


$$ \int d\sigma\, \partial_\mu J^\mu = \partial_t \int d\sigma\, \rho + \int d\sigma^i J_i = 0. \tag{12.16} $$


The integral over $J_i$ vanishes since the system is closed, i.e. no current flows in
or out of the total system. Thus we have (actually by assumption of closure) 


$$ \frac{dQ(t)}{dt} = 0. \tag{12.17} $$


This equation is well known in many forms. For the conservation of electric 
charge, it expresses the basic assumption of electromagnetism that charge is 
conserved. In mechanics, we have the equation for conservation of momentum 


$$ \frac{dp_i}{dt} = \frac{d}{dt} \int d\sigma\, \theta_{0i} = 0. \tag{12.18} $$

The conserved charge is formally the generator of the symmetry which leads to 
the conservation rule, i.e. it is the conjugate variable in the group transformation. 
In a group transformation, we always have an object of the form: 

$$ e^{iQv}, \tag{12.19} $$
where Q is the generator of the symmetry and v is the conjugate variable which 
parametrizes the symmetry (see table 12.1). Noether’s theorem is an expression 
of symmetry. It tells us that — if there is a symmetry under variations of a 
parameter in the action — then there is a divergenceless current associated with 
that symmetry and a corresponding conserved charge. The formal statement is: 
The invariance of the Lagrangian under a one-parameter family
of transformations implies the existence of a divergenceless current 
and associated conserved ‘charge’. 


Noether’s theorem is not the only approach to finding conserved currents, but 
it is the most well known and widely used [2]. The physical importance of 
conservation laws for dynamics is that 


A local excess of a conserved quantity cannot simply 
disappear; it can only relax by spreading slowly
over the entire system. 


12.2 Electric current $J_\mu$ for point charges


Electric current is the rate of flow of charge

$$ I = \frac{dQ}{dt}. \tag{12.20} $$


Current density (current per unit area, in three spatial dimensions) is a vector,
proportional to the velocity $\mathbf{v}$ of charges and their density $\rho_e$:

$$ J^i = \rho_e v^i. \tag{12.21} $$

By adding a zeroth component $J^0 = \rho_e c$, we may write the spacetime-covariant
form of the current as

$$ J^\mu = \rho_e \beta^\mu, \tag{12.22} $$


where $\beta^\mu = (c, \mathbf{v})$. For a point particle at position $\mathbf{x}_p(t)$, we may write the
charge density using a delta function. The n-dimensional spatial delta function
has the dimensions of density and the charge of the particle is $q$. The current per
unit area $J^i$ is simply $q$ multiplied by the velocity of the charge:

$$ J^0/c = \rho(\mathbf{x}) = q\, \delta^n(\mathbf{x} - \mathbf{x}_p(t)), \qquad J^i = \rho(\mathbf{x})\, \frac{dx_p^i(t)}{dt}. \tag{12.23} $$

Relativistically, it is useful to express the current in terms of the velocity vectors
$\beta^\mu$ and $U^\mu$. For a general charge distribution the expressions are

$$ J^\mu(x) = \rho c \beta^\mu = \rho c \gamma^{-1} U^\mu. \tag{12.24} $$
Table 12.2. Currents for various fields.


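Field                               Current
Point charges, velocity \mathbf{v}  J^0 = e\rho c
                                    \mathbf{J} = e\rho\mathbf{v}
Schrödinger field                   J^0 = e\psi^*\psi
                                    \mathbf{J} = i\frac{e\hbar^2}{2m}\left(\psi^*(\mathbf{D}\psi) - (\mathbf{D}\psi)^*\psi\right)
Klein–Gordon field                  J_\mu = ie\hbar^2 c^2\left(\phi^*(D_\mu\phi) - (D_\mu\phi)^*\phi\right)
Dirac field                         J_\mu = iec\,\bar\psi\gamma_\mu\psi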


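Thus, for a point particle,

$$ J^\mu = q c \beta^\mu\, \delta^n(\mathbf{x} - \mathbf{x}_p(t)) = q c \int dt\, \delta^{n+1}(x - x_p(t))\, \beta^\mu = q \int d\tau\, \delta^{n+1}(x - x_p(\tau))\, U^\mu. \tag{12.25} $$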


12.3 Electric current for fields 


The form of the electric current in terms of field variables is different for each 
of the field types, but in each case we may define the current by 


$$ J^\mu \equiv \frac{\delta S_M}{\delta A_\mu}, \tag{12.26} $$

where $S_M$ is the action for matter fields, including their gauge-invariant coupling
to the Maxwell field $A_\mu$, but not including the Maxwell action (eqn. (21.1))
itself. The action must be one consisting of complex fields, since the gauge 
symmetry demands invariance under arbitrary complex phase transformations. 
A single-component, non-complex field does not give rise to an electric current. 
The current density for quanta with charge e may be summarized in terms of 
the major fields as seen in table 12.2. The action principle displays the form of 
these currents in a straightforward way and also clarifies the interpretation of the 
source as a current. For example, consider the complex Klein—Gordon field in 
the presence of a source: 


$$ S = S_M + S_J = \int (dx) \left\{ \hbar^2 c^2 (D^\mu\phi)^* (D_\mu\phi) - J^\mu A_\mu \right\}, \tag{12.27} $$
where terms independent of $A_\mu$ have been omitted for simplicity. Using
eqn. (12.26), and assuming that $J_\mu$ is independent of $A_\mu$, one obtains

$$ \frac{\delta S_M}{\delta A_\mu} = ie\hbar^2 c^2 \left[ \phi^* (D^\mu\phi) - (D^\mu\phi)^* \phi \right] = J^\mu. \tag{12.28} $$

Note carefully here: although the left and right hand sides are numerically equal,
they are not formally identical, since $J_\mu$ was assumed to be independent of $A_\mu$
under the variation, whereas the left hand side is explicitly dependent on $A_\mu$
through the covariant derivative. Sometimes these are confused in the literature,
leading to the following error.
It is often stated that the coupling for the electromagnetic field to matter can 
be expressed in the form: 


$$ S_M = S_M[A_\mu = 0] + \int (dx)\, J^\mu A_\mu. \tag{12.29} $$


In other words, the total action can be written as a sum of a matter action
(omitting $A_\mu$, or with partial derivatives instead of covariant derivatives), plus
a linear source term (which is supposed to make up for the gauge parts in the
covariant derivatives) plus the Maxwell action. This is incorrect because, for any
matter action which has quadratic derivatives (all fields except the Dirac field),
one cannot write the original action as the current multiplying the potential, just
as

$$ x^2 \neq \left( \frac{d}{dx} x^2 \right) x. \tag{12.30} $$
In our case,

$$ \frac{\delta S_M}{\delta A_\mu} A_\mu \neq S_M. \tag{12.31} $$


The Dirac field does not suffer from this problem. Given the action plus source 
term, 


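$$ S = S_M + S_J = \int (dx) \left\{ -\frac{1}{2} i\hbar c\, \bar\psi \left( \gamma^\mu \overrightarrow{D_\mu} - \gamma^\mu \overleftarrow{D_\mu} \right) \psi - J^\mu A_\mu \right\}, \tag{12.32} $$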


the variation of the action equals 


$$ \frac{\delta S_M}{\delta A_\mu} = iqc\, \bar\psi \gamma^\mu \psi = J^\mu. \tag{12.33} $$


In this unique instance the source and current are formally and numerically 
identical, and we may write 


$$ S_M = S_M[A_\mu = 0] + J^\mu A_\mu. \tag{12.34} $$
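
The difference between a quadratic and a linear coupling can be illustrated with a toy model in which the action is an ordinary function of a single number $A$. The sketch below is an illustration only: `quadratic_action` and `linear_action`, the constants `p` and `e`, and the finite-difference derivative are hypothetical stand-ins for the Klein–Gordon and Dirac couplings, not the field-theoretical actions themselves. It tests whether $S(A) = S(0) + A\, dS/dA$ holds.

```python
def derivative(f, a, h=1e-6):
    """Central finite-difference estimate of df/da (toy stand-in for dS/dA)."""
    return (f(a + h) - f(a - h)) / (2.0 * h)


def source_form(f, a):
    """Proposed splitting of the action: S[A = 0] + A * dS/dA."""
    return f(0.0) + a * derivative(f, a)


def quadratic_action(a, p=1.5, e=0.7):
    """Toy analogue of an action quadratic in the covariant derivative."""
    return (p - e * a) ** 2


def linear_action(a, p=1.5, e=0.7):
    """Toy analogue of an action linear in the covariant derivative."""
    return p - e * a


if __name__ == "__main__":
    a = 2.0
    for name, action in (("quadratic", quadratic_action),
                         ("linear", linear_action)):
        mismatch = source_form(action, a) - action(a)
        print(f"{name:9s} action: source form minus action = {mismatch:.6f}")
```

In this toy model the linear case is reproduced exactly, while the quadratic case is off by $e^2 A^2$, which is the analogue of the term that a linear source coupling cannot account for.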
12.4 Requirements for a conserved probability


According to quantum theory, the probability of finding a particle at a position
$x$ at time $t$ is derived from an invariant product of the fields. Probabilities must
be conserved if we are to have a particle theory which makes sense. For the
Schrödinger wavefunction, this is simply $\psi^*\psi$, but this is only true because this
combination happens to be a conserved density $N(x)$ for the Schrödinger action.

In order to establish a probability interpretation for other fields, one may use 
Noether’s theorem. In fact, we have already done this. A conserved current is 
known from the previous section: namely the electric current, but there seems to 
be no good reason to require the existence of electric charge in order to be able to 
speak of probabilities. We would therefore like to abstract the invariant structure 
of the conserved quantity without referring specifically to electric charge — after 
all, particles may have several charges, nuclear, electromagnetic etc — any one 
of these should do for counting particle probabilities. 

Rather than looking at local gauge transformations, we therefore turn to 
global phase transformations¹ and remove the reference in the argument of the
phase exponential to the electric charge. Consider first the Schrödinger field, 
described by the action 


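$$ S = \int d\sigma\, dt \left\{ -\frac{\hbar^2}{2m} (\partial^i\psi)^* (\partial_i\psi) - V\psi^*\psi + \frac{i}{2} \left( \psi^* \partial_t\psi - \psi\, \partial_t\psi^* \right) \right\}. \tag{12.35} $$

The variation of the action with respect to constant $\delta s$ under a phase transformation
$\psi \to e^{is}\psi$ is given by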


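$$ \delta S = \int d\sigma\, dt \left\{ -\frac{\hbar^2}{2m} \left[ -i\delta s\, (\partial^i\psi^*)(\partial_i\psi) + (\partial^i\psi^*)\, i\delta s\, (\partial_i\psi) \right] + i \left[ -i\delta s\, \psi^* \partial_t\psi + i\delta s\, \psi^* \partial_t\psi \right] \right\}. \tag{12.36} $$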


Note that the variation $\delta s$ need not vanish simply because it is independent of $x$
(see comment at end of section). Integrating by parts and using the equation of 
motion, 


$$ -\frac{\hbar^2}{2m} \nabla^2\psi + V\psi = i \frac{\partial\psi}{\partial t}, \tag{12.37} $$


we obtain the expression for the continuity equation: 


$$ \delta S = \int d\sigma\, dt\, \delta s \left( \partial_t J^t + \partial_i J^i \right) = 0, \tag{12.38} $$


¹ Global gauge transformations are also called rigid since they are fixed over all space.
where

$$ J^t = \psi^*\psi = \rho, \qquad J^i = \frac{i\hbar^2}{2m} \left[ \psi^* (\partial^i\psi) - (\partial^i\psi^*)\psi \right], \tag{12.39} $$


which can be compared to the current conservation equation eqn. (12.1). $\rho$ is the
probability density and $J^i$ is the probability current. The conserved probability,
by Noether’s theorem, is therefore 


$$ P = \int d\sigma\, \psi^*(x)\psi(x), \tag{12.40} $$


and this can be used to define the notion of an inner product between two 
wavefunctions, given by the overlap integral 


$$ (\psi_1, \psi_2) = \int d\sigma\, \psi_1^*(x)\psi_2(x). \tag{12.41} $$

Thus we see how the notion of an invariance of the action leads to the 
identification of a conserved probability for the Schrödinger field. 

Consider next the Klein–Gordon field. Here we are effectively doing the same
thing as before in eqn. (12.4), but keeping $s$ independent of $x$ and setting
$D_\mu \to \partial_\mu$ and $e \to 1$:


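$$ S = \int (dx)\, \hbar^2 c^2 (\partial^\mu\phi)^* (\partial_\mu\phi), $$
$$ \delta S = \int (dx)\, \hbar^2 c^2 \left[ \left( \partial^\mu \phi^* (-i\delta s) e^{-is} \right) \left( \partial_\mu \phi\, e^{is} \right) + \text{c.c.} \right] = \int (dx)\, \delta s\, (\partial_\mu J^\mu), \tag{12.42} $$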
where 


$$ J^\mu = -i\hbar^2 c^2 \left( \phi^* \partial^\mu\phi - \phi\, \partial^\mu\phi^* \right). \tag{12.43} $$


The conserved ‘charge’ of this symmetry can now be used as the definition of 
the inner product between fields: 


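$$ (\phi_1, \phi_2) = i\hbar c \int d\sigma^\mu \left( \phi_1^* \partial_\mu\phi_2 - (\partial_\mu\phi_1)^* \phi_2 \right), \tag{12.44} $$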


or, in non-covariant form, 


$$ (\phi_1, \phi_2) = i\hbar c \int d\sigma \left( \phi_1^* \partial_0\phi_2 - (\partial_0\phi_1)^* \phi_2 \right). \tag{12.45} $$


This is now our notion of probability. 
Here we have shown that a conserved probability can be attributed to any
complex field as a result of symmetry under rigid (global) phase transformations. 
One should be somewhat wary of the physical meaning of rigid gauge transfor- 
mations, since this implies a notion of correlation over arbitrary distances and 
times (a fact which apparently contradicts the finite speed of communication 
imposed by relativity). Global transformations should probably be regarded as 
an idealized case. In general, one requires the notion of a charge and associated 
gauge field, but not necessarily the electromagnetic gauge field. An additional 
point is: does it make physical sense to vary an object which does not depend on 
any dynamical variables x, t? How should it vary without any explicit freedom 
to do so? These points could make one view rigid (global) gauge transformations 
with a certain skepticism. 


12.5 Real fields 


A cursory glance at the expressions for the electric current shows that $J_\mu$ vanishes
for real fields. Formally this is because the gauge (phase) symmetry cannot exist 
for real fields, since the phase is always fixed at zero. Consequently, there is no 
conserved current for real fields (though the energy-momentum tensor is still 
conserved). In the second-quantized theory of real fields (which includes the 
photon field), this has the additional effect that the number of particles with a 
given momentum is not conserved. 
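
The vanishing of the current for real fields can be checked directly. The following is a minimal sketch under stated assumptions: it sets $e = \hbar = c = 1$, replaces the covariant derivative by an ordinary derivative in one dimension, and uses the hypothetical helper names `current_density`, `complex_wave` and `real_wave`, which are not part of the text.

```python
import cmath
import math


def current_density(phi, dphi):
    """Return i (phi* dphi - (dphi)* phi) for a field value and its derivative.

    Assumptions: e = hbar = c = 1 and an ordinary derivative instead of D_mu.
    """
    phi, dphi = complex(phi), complex(dphi)
    j = 1j * (phi.conjugate() * dphi - dphi.conjugate() * phi)
    return j.real  # the combination is real by construction


def complex_wave(k, x):
    """Complex plane wave exp(ikx) and its x-derivative."""
    phi = cmath.exp(1j * k * x)
    return phi, 1j * k * phi


def real_wave(k, x):
    """Real wave cos(kx) and its x-derivative."""
    return math.cos(k * x), -k * math.sin(k * x)


if __name__ == "__main__":
    k, x = 1.3, 0.4
    print("complex field:", current_density(*complex_wave(k, x)))
    print("real field:   ", current_density(*real_wave(k, x)))
```

For the complex plane wave the combination is proportional to $k$, whereas for the real wave the two terms cancel identically.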


The problem is usually resolved in the second-quantized theory by distinguishing
between excitations of the field (particles) with positive energy and those with
negative energy, since the relativistic energy equation $E^2 = p^2c^2 + m^2c^4$
admits both possibilities. We do this by writing the real field as a sum of
two parts:

$$ \phi = \phi^{(+)} + \phi^{(-)}, \tag{12.46} $$

where $\phi^{(+)*} = \phi^{(-)}$. Here $\phi^{(+)}$ is a complex quantity, but the sum $\phi^{(+)} + \phi^{(-)}$ is
clearly real. What this means is that it is possible to define a conserved current
and therefore an inner product on the manifold of positive energy solutions $\phi^{(+)}$,

$$ (\phi_1^{(+)}, \phi_2^{(+)}) = i\hbar c \int d\sigma^\mu \left( \phi_1^{(+)*} \partial_\mu \phi_2^{(+)} - (\partial_\mu \phi_1^{(+)})^* \phi_2^{(+)} \right), \tag{12.47} $$

and another on the manifold of negative energy solutions $\phi^{(-)}$. Thus there is
local conservation of probability (though charge still does not make any sense)
of particles and anti-particles separately.
12.6 Super-conductivity


Consider a charged particle in a uniform electric field $E^i$. The force on the
particle leads to an acceleration:

$$ q E^i = m \ddot{x}^i. \tag{12.48} $$


Assuming that the particle starts initially from rest, and is free of other
influences, at time $t$ it has the velocity

$$ \dot{x}^i(t) = \frac{q}{m} \int_0^t E^i\, dt'. \tag{12.49} $$


This movement of charge represents a current (charge multiplied by velocity). 
If one considers N such identical charges, then the current is 


$$ J^i(t) = N q \dot{x}^i = \frac{N q^2}{m} \int_0^t E^i\, dt'. \tag{12.50} $$

Assuming, for simplicity, that the electric field is constant, at time $t$ one has

$$ J^i(t) = \frac{N q^2 t}{m} E^i = \sigma E^i. \tag{12.51} $$

The last line is Ohm’s law, $V = IR$, re-written in terms of the current density
$J^i$ and the reciprocal resistance, or conductivity $\sigma = 1/R$. This shows that a
free charge has an ohmic conductivity which is proportional to time. It tends to
infinity. Free charges are super-conducting.

The classical theory of ohmic resistance assumes that charges are scattered
by the lattice through which they pass. After a mean free time of $\tau$, which is a
constant for a given material under a given set of thermodynamical conditions,
the conductivity is $\sigma = Nq^2\tau/m$. This relation assumes hidden dissipation,
and thus can never emerge naturally from a fundamental formulation, without 
modelling the effect of collisions as a transport problem. Fundamentally, all 
charges super-conduct, unless they are scattered by some impedance. The 
methods of linear response theory may be used for this. 
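
The contrast between the free-charge result and the scattered case can be made concrete with a few numbers. The sketch below simply evaluates eqn. (12.51) and the mean-free-time formula; the function names and the parameter values are illustrative assumptions in arbitrary units, not data from any material.

```python
def free_charge_current(n, q, m, e_field, t):
    """J(t) = N q^2 t E / m for N free charges starting from rest, eqn (12.51)."""
    return n * q**2 * t * e_field / m


def effective_conductivity(n, q, m, t):
    """Time-dependent 'conductivity' sigma(t) = N q^2 t / m of free charges."""
    return n * q**2 * t / m


def drude_conductivity(n, q, m, tau):
    """Conductivity sigma = N q^2 tau / m for a mean free time tau."""
    return n * q**2 * tau / m


if __name__ == "__main__":
    n, q, m, tau = 1.0e3, 1.0, 2.0, 0.5  # arbitrary illustrative units
    for t in (0.1, 0.5, 1.0, 10.0):
        print(t, effective_conductivity(n, q, m, t),
              drude_conductivity(n, q, m, tau))
```

The free-charge value grows without bound as $t$ increases, while the scattered value stays fixed at the value the free charges reach at $t = \tau$.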

If one chooses a gauge in which the electric field may be written 


$$ E^i = -\partial_t A^i, \tag{12.52} $$

then substitution into eqn. (12.50) gives

$$ J^i = -\Lambda A^i, \tag{12.53} $$

where $\Lambda = Nq^2/m$. This is known as London’s equation, and was originally
written down as a phenomenological description of super-conductivity. 
The classical model of super-conductivity seems naive in a modern, quantum
age. However, the quantum version is scarcely more sophisticated. As noted 
in ref. [135], the appearance of super-conductivity is a result only of symmetry 
properties of super-conducting materials at low temperature, not of the detailed 
mechanism which gives rise to those symmetry properties. 

Super-conductivity arises because of an ordered state of the field in which the 
inhomogeneities of scattering centres of the super-conducting material become 
invisible to the average state. Consider such a state in a scalar field. The super- 
conducting state is one of great uniformity, characterized by 


$$ \partial_\mu \langle\phi(x)\rangle = \langle A_\mu \rangle = 0. \tag{12.54} $$


The average value of the field is thus locked in a special gauge. In this state, the 
average value of the current is given by 


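$$ \langle J_i \rangle = \left\langle ie\hbar^2 c^2 \left( \phi^* (D_i\phi) - (D_i\phi)^* \phi \right) \right\rangle. \tag{12.55} $$

The time derivative of this is:

$$ \partial_t \langle J_i \rangle = -e^2 c^2\, \partial_t \langle A_i \rangle = e^2 c^2 \langle E_i \rangle. \tag{12.56} $$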


This is the same equation found for the classical case above. For constant 
external electric field, it leads to a current which increases linearly with time, 
i.e. it becomes infinite for infinite time. This corresponds to infinite conductivity. 
Observe that the result applies to statistical averages of the fields, in the same 
way that spontaneous symmetry breaking applies to statistical averages of the 
field, not individual fluctuations (see section 10.7). The individual fluctuations 
about the ground state continue to probe all aspects of the theory, but these are 
only jitterings about an energetically favourable super-conducting mean field. 
The details of how the uniform state becomes energetically favourable require, 
of course, a microscopic theory for their explanation. This is given by the 
BCS theory of super-conductivity [5] for conventional super-conductors. More 
recently, unusual materials have given rise to super-conductivity at unusually 
high temperatures, where an alternative explanation is required. 


12.7 Duality, point charges and monopoles 
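In covariant notation, Maxwell’s equations are written in the form

$$ \partial_\mu F^{\mu\nu} = -\mu_0 J^\nu, \qquad \epsilon^{\mu\nu\rho\sigma} \partial_\nu F_{\rho\sigma} = 0. \tag{12.57} $$

If one defines the dual $F^*$ of a tensor $F$ by one-half its product with the anti-symmetric
tensor, one may write

$$ F^{*\mu\nu} = \frac{1}{2} \epsilon^{\mu\nu\rho\sigma} F_{\rho\sigma}, \tag{12.58} $$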
and Maxwell’s equations become

$$ \partial_\mu F^{\mu\nu} = -\mu_0 J^\nu, \qquad \partial_\mu F^{*\mu\nu} = 0. \tag{12.59} $$


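The similarity between these two equations has prompted some to speculate as
to whether a dual current, $J_m^\mu$, could not exist:

$$ \partial_\mu F^{\mu\nu} = -\mu_0 J^\nu, \qquad \partial_\mu F^{*\mu\nu} = -\mu_0 J_m^\nu. \tag{12.60} $$

This would imply an equation of the form

$$ \nabla \cdot \mathbf{B} = \partial_i B^i = \mu_0 \rho_m \tag{12.61} $$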


and the existence of magnetic monopoles. The right hand side of these equations 
is usually thought of as a source term, or forcing term, for the differential terms 
on the left hand side. The existence of pointlike singularities is an interesting 
issue, since it touches the limits of the smooth differential formalism used to 
express the theory of electromagnetism and drives home the reasoning behind 
the model of pointlike charges which physicists have adopted. 

Consider a Coulomb field surrounding a point. Up to a factor of $4\pi\epsilon_0$, the
electric field has the vectorial form

$$ E_i = \frac{x_i}{|x|^m} \tag{12.62} $$

in $n$ dimensions. When $n = 3$ we have $m = 3$ for the Coulomb field, i.e. a $1/r^2$
force law. The derivative of this field is

$$ \partial_i E_j = \frac{1}{|x|^m} \left( \delta_{ij} - m\, \frac{x_i x_j}{x^k x_k} \right). \tag{12.63} $$

From this, we have that

$$ \partial_i E_i = \frac{n - m}{|x|^m}, \qquad (\nabla \times \mathbf{E})_i = \epsilon_{ijk} \partial_j E_k = 0. \tag{12.64} $$

The last result follows entirely from the symmetry on the indices: the product
of a symmetric matrix and an anti-symmetric matrix is zero. What we see is
that, in $n > 2$ dimensions, we can find a solution $n = m$ where the field
satisfies the equation of motion identically, except at the singularity $x_i = 0$,
where the solution does not exist. In other words, a field can exist without a
source, everywhere except at the singular point. 
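
The statement that the divergence of $x_i/|x|^m$ vanishes away from the origin when $m = n$ is easy to test numerically. The following is a rough sketch using central finite differences; the helper names, the sample point and the step size `h` are arbitrary choices made for illustration.

```python
import math


def field(x, m):
    """Components E_i = x_i / |x|^m at the point x (a list of n coordinates)."""
    r = math.sqrt(sum(c * c for c in x))
    return [c / r**m for c in x]


def divergence(x, m, h=1e-5):
    """Central finite-difference estimate of d_i E_i at the point x."""
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        total += (field(forward, m)[i] - field(backward, m)[i]) / (2.0 * h)
    return total


def analytic_divergence(x, m):
    """The closed form (n - m) / |x|^m, with n the number of coordinates."""
    r = math.sqrt(sum(c * c for c in x))
    return (len(x) - m) / r**m


if __name__ == "__main__":
    point = [0.3, -0.7, 1.1]  # any point away from the origin
    for m in (2, 3, 4):
        print(m, divergence(point, m), analytic_divergence(point, m))
```

Only the case $m = n = 3$ gives a divergence consistent with zero at the sample point; the other powers leave a residual source everywhere.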

In fact, this is an illusion of the differential formulation of Maxwell’s 
equations; it highlights a conceptual difficulty. The core difficulty is that the 
equations are really non-local, in the sense that they relate a field at one point to 
a source at another. This requires an integration over the intermediate points to 
be well defined differentially. The differential form of Maxwell’s equations is 
really a shorthand for the integral procedure. 

At the singular point, the derivative does not exist, and Maxwell’s equation 
becomes meaningless. We can assign a formal meaning to the differential form 
and do slightly better, as it turns out, by using the potential $A_\mu$, since this can
be regularized by choosing variables in which the singularity disappears. In that
way we can assign a formal meaning to the field around a point and justify 
the introduction of a source for the field surrounding the singularity using an 
integral formulation. The formulation we are looking for is in terms of Green 
functions. Green functions are, in a sense, a regularization scheme for defining 
the meaning of an ambiguous, irregular (infinite) expression. This is also the 
first in a long litany of cases where it is necessary to regularize, or re-formulate 
infinite, badly defined expressions in the physics of fields, which result from 
assumptions about pointlike structure and Green functions. 

In terms of the vector potential $A_\mu$, choosing the so-called Coulomb gauge
$\partial_i A^i = 0$, we have

$$ E_i/c = -\partial_0 A_i + \partial_i A_0, \tag{12.65} $$

so that the divergence of the electric field is

$$ \partial_i E^i = -\nabla^2\phi = \rho. \tag{12.66} $$


Note that we set $\epsilon_0 = 1$ for the purpose of this schematic. The charge density
for a point particle with charge $q$ at the origin is written as

$$ \rho = q\, \delta(\mathbf{x}) = q\, \delta(x)\delta(y)\delta(z) = \frac{q}{4\pi r^2}\, \delta(r). \tag{12.67} $$

Thus, in polar coordinates, about the origin,

$$ -\nabla^2 \phi(r) = \frac{q\, \delta(r)}{4\pi\epsilon_0 r^2}. \tag{12.68} $$


The Green function G(x, x’) is defined as the object which satisfies the equation 


$$ -\nabla^2 G(x, x') = \delta(x - x'). \tag{12.69} $$
If we compare this definition to the Poisson equation for the potential $\phi(x)$ in
eqn. (12.68), we see that $G(x, x')$ can be interpreted as the scalar potential for a
delta-function source at x = x’, with unit charge. Without repeating the content 
of chapter 5, we can simply note the steps in understanding the singularity at the 
origin. In the case of the Coulomb potential in three dimensions, the answer is 
well known: 


$$ \phi(r) = \frac{q}{4\pi r}. \tag{12.70} $$


We can use this to verify the consistency of the Green function definition of
the field, in lieu of a more proper treatment later. By multiplying the Poisson
equation by the Green function, one has

$$ \int d^3x' \left( -\nabla'^2 \phi(x') \right) G(x, x') = \int d^3x'\, \rho(x')\, G(x, x'). \tag{12.71} $$

Integrating by parts, and using the definition of $G(x, x')$,

$$ \phi(x) = \int d^3x'\, \rho(x')\, G(x, x'). \tag{12.72} $$

Substituting the polar coordinate forms for $\phi(r)$ and using the fact that $G(r, r')$
is just $\phi(r - r')$ in this instance, we have

$$ \frac{q}{4\pi r} = \int \frac{1}{4\pi (r - r')}\, \frac{q\, \delta(r')}{4\pi r'^2}\, 4\pi r'^2\, dr'. \tag{12.73} $$

This equation is self-consistent and avoids the singular nature of the $r'$ integration
by virtue of cancellations with the integration measure $\int d^3x' = 4\pi r'^2 dr'$.
We note that both the potential and the field are still singular at the origin. 
What we have achieved here, however, is to show that the singularity is 
related to a delta-function source (well defined under integration). Without 
the delta-function source $\rho$, the only consistent solution is $\phi = \text{const.}$ in the
equation above. Thus we do, in fact, need the source to explain the central 
Coulomb field. 
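
The role of the delta-function source can also be seen from the flux of the field through spheres of different radii: if the only source sits at the origin, the flux must be the same for every radius and equal to the charge. The sketch below assumes $\epsilon_0 = 1$ and the potential $q/4\pi r$, and differentiates it numerically; the function names and the step `h` are illustrative choices.

```python
import math


def potential(r, q=1.0):
    """Coulomb potential q / (4 pi r), with epsilon_0 set to 1."""
    return q / (4.0 * math.pi * r)


def radial_field(r, q=1.0, h=1e-6):
    """E_r = -d(phi)/dr, estimated with a central finite difference."""
    return -(potential(r + h, q) - potential(r - h, q)) / (2.0 * h)


def flux_through_sphere(r, q=1.0):
    """Total flux 4 pi r^2 E_r through a sphere of radius r about the origin."""
    return 4.0 * math.pi * r**2 * radial_field(r, q)


if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0):
        print(r, flux_through_sphere(r, q=2.0))
```

The flux is independent of the radius to within the finite-difference error, which is the integral statement that the whole source is concentrated at the singular point.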

In fact, the singular structure noted here is a general feature of central fields, 
or conservative fields, whose curl vanishes. A non-vanishing curl, incidentally, 
indicates the presence of a magnetic field, and thus requires a source for the 
magnetism, or a magnetic monopole. 

The argument for magnetic monopoles is based on the symmetry of the 
differential formulation of Maxwell’s equations. We should pay attention to 
the singular nature of pointlike sources when considering this point. If we view 
everything in terms of singularities, then a magnetic monopole exists trivially: 
it is the Lorentz boost of a point charge, i.e. a string of current. The existence 
of other monopoles can be inferred from other topological singularities in the 
spacetime occupied by the field. 


13 


The non-relativistic limit 


In some branches of physics, such as condensed matter and quantum optics, one 
deals exclusively with non-relativistic models. However, there are occasionally 
advantages to using a relativistic formulation in quantum theory; by embedding 
a theory in a larger framework, one often obtains new insights. It is therefore 
useful to be able to take the non-relativistic limit of generally covariant theories, 
both as an indication of how large or small relativistic effects are and as a cultural 
bridge between covariant physics and non-relativistic quantum theory. 


13.1 Particles and anti-particles 


There is no unified theory of particles and anti-particles in the non-relativistic 
field theory. Formally there are two separate theories. When we take the 
non-relativistic limit of a relativistic theory, it splits into two disjoint theories: 
one for particles, with only positive definite energies, and one for anti-particles, 
with only negative definite energies. Thus, a non-relativistic theory cannot 
describe the interaction between matter and anti-matter. 

The Green functions and fields reflect this feature. The positive frequency 
Wightman function goes into the positive energy particle theory, while the nega- 
tive frequency Wightman function goes into the negative energy anti-particle 
theory. The objects which one then refers to as the Wightman functions 
of the non-relativistic field theory are asymmetrical. In normal Schrödinger
field theory for matter, one says that the zero temperature negative frequency
Wightman function is zero.¹


¹ At finite temperature it must have a contribution from the heat bath for consistency.
13.2 Klein–Gordon field
13.2.1 The free scalar field


We begin by considering the Klein—Gordon action for a real scalar field, since 
this is the simplest of the cases and can be treated at the level of the action. It also 
reveals several subtleties in the way quantities are defined and the names various 
quantities go by. In particular, we must recall that relativistic theories have an 
indefinite metric, while non-relativistic theories can be thought of as having a 
Euclidean, definite metric. Since one is often interested in the non-relativistic 
limit in connection with atomic systems, we illustrate the emergence of atomic 
levels by taking a two-component scalar field, in which the components have 
different potential energy in the centre of mass frame of the field. This is 
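incorporated by adopting an effective mass $m_A = m + E_A/c^2$.
Consider the action:

$$ S = \int (dx) \left\{ \frac{1}{2} \hbar^2 c^2 (\partial^\mu\phi_A)(\partial_\mu\phi_A) + \frac{1}{2} m_A^2 c^4\, \phi_A \phi_A \right\}. \tag{13.1} $$

The variation of our action, with respect to the atomic variables, leads to

$$ \delta S = \int (dx)\, \delta\phi_A \left( -\hbar^2 c^2\, \Box + m_A^2 c^4 \right) \phi_A + \hbar^2 c^2 \int d\sigma^\mu \left( \delta\phi_A\, \partial_\mu\phi_A \right). \tag{13.2} $$

The vanishing of the first term leads to the field equation

$$ \hbar^2 c^2 \left( -\Box + \frac{m_A^2 c^2}{\hbar^2} \right) \phi_A(x) = 0. \tag{13.3} $$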

The second (surface) term in this expression shows that any conserved probability
must transform like an object of the form $\phi_A \partial_\mu \phi_A$. In fact, the real scalar
field has no conserved current from which to derive a notion of locally conserved
probability, but we may note the following. Any complex scalar field $\phi$ has a
conserved current, which allows one to define the inner product

$$ (\phi_1, \phi_2) = i\hbar c \int d\sigma^\mu \left( \phi_1^* \partial_\mu\phi_2 - (\partial_\mu\phi_1)^* \phi_2 \right), \tag{13.4} $$

where $d\sigma^\mu$ is the volume element on a spacelike hyper-surface through spacetime.
This result is central even to the real scalar field, since a real scalar field
does have a well defined probability density in the non-relativistic limit. To see
this, we observe that the real scalar field $\phi(x)$ may be decomposed into positive
and negative frequency parts:

$$ \phi(x) = \phi^{(+)}(x) + \phi^{(-)}(x), \tag{13.5} $$

where $\phi^{(+)}(x)$ is the positive frequency part of the field, $\phi^{(-)}(x)$ is the negative
frequency part of the field and $\phi^{(+)}(x) = (\phi^{(-)}(x))^*$. Since the Schrödinger
equation has no physical negative energy solutions, one must discard the
negative frequency half of the spectrum when reducing the Klein–Gordon field
to a Schrödinger field. This leads to well known expressions for the probability
density. Starting with the probability

$$ p = 2i\hbar c \int d\sigma\, (\phi^* \partial_0 \phi) \tag{13.6} $$

and letting

$$ \phi^{(+)}(x) = \frac{\psi(x)}{\sqrt{2mc^2}}, \qquad \phi^{(-)}(x) = \frac{\psi^*(x)}{\sqrt{2mc^2}}, \tag{13.7} $$

one obtains

$$ p = \frac{i\hbar}{mc} \int d\sigma\, (\psi + \psi^*)\, \partial_0\, (\psi + \psi^*). \tag{13.8} $$

Assuming only that $\phi(x)$ may be expanded in a complete set of plane waves
$\exp(i\mathbf{k}\cdot\mathbf{x} - i\omega t)$, satisfying the free wave equation $\hbar^2\omega^2 = \hbar^2 k^2 c^2 + m^2 c^4$,
then in the non-relativistic limit $\hbar^2 k^2 \ll m^2 c^2$, we may make the effective
replacement $i\hbar\partial_0 \to mc$ to lowest order. Thus we have

$$ p = \int d\sigma\, \psi^*\psi, \tag{13.9} $$


which is the familiar result for non-relativistic particles. It is easy to check that 
p is a dimensionless quantity using our conventions. 
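
The quality of the lowest order replacement can be gauged from the dispersion relation itself. The sketch below compares $\hbar\omega = \sqrt{p^2c^2 + m^2c^4}$ with the non-relativistic form $mc^2 + p^2/2m$, where $p = \hbar k$; the function names and the choice of units $m = c = 1$ are assumptions made for illustration.

```python
import math


def relativistic_energy(p, m, c):
    """Exact energy sqrt(p^2 c^2 + m^2 c^4) of a free particle."""
    return math.sqrt(p * p * c * c + m * m * c**4)


def nonrelativistic_energy(p, m, c):
    """Rest energy plus Schrodinger kinetic energy, m c^2 + p^2 / 2m."""
    return m * c * c + p * p / (2.0 * m)


def relative_error(p, m, c):
    """Relative error of the non-relativistic form against the exact energy."""
    exact = relativistic_energy(p, m, c)
    return abs(nonrelativistic_energy(p, m, c) - exact) / exact


if __name__ == "__main__":
    m, c = 1.0, 1.0  # units in which m = c = 1
    for p in (0.01, 0.1, 0.5, 1.0):
        print(p, relative_error(p, m, c))
```

The error falls off rapidly as $p/mc$ becomes small, as expected from the next term in the expansion, $-p^4/8m^3c^2$.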
This observation prompts us to define the invariant inner product of two fields
$\phi_A$ and $\phi_B$ by

$$ (\phi_A, \phi_B) = i\hbar c \int d\sigma\, \frac{1}{2} \left( \phi_A^* \partial_0\phi_B - (\partial_0\phi_A)^* \phi_B \right). \tag{13.10} $$

The complex conjugate symbol is only a reminder here of how to take the non-relativistic
limit, since $\phi_A$ is real. This product vanishes unless $A \neq B$, thus it
must represent an amplitude to make a transition from $\phi_1$ to $\phi_2$ or vice versa.
The non-relativistic limit of this expression is

$$ (\phi_A, \phi_B) \to \frac{1}{2} \int d\sigma \left[ \psi_A^* \psi_B + \psi_B^* \psi_A \right]. \tag{13.11} $$

Since $\psi(x)$ is the field theoretical destruction operator and $\psi^*(x)$ is the creation
operator, this is now manifestly a transition matrix, annihilating a lower state
and creating an upper state or vice versa. The apparent $A, B$ symmetry of
eqn. (13.11) is a feature only of the lowest order term. Higher order corrections
to this expression are proportional to $E_1 - E_2$, the energy difference between
the two field levels, owing to the presence of $\partial_0$.
Using the interaction term P, we may compute the non-relativistic limit of the
action in eqn. (13.1). This procedure is unambiguous only up to re-definitions 
of the origin for the arbitrary energy scale. Equivalently, we are free to define 
the mass used to scale the fields in any convenient way. The simplest procedure 
is to re-scale the fields by the true atomic mass, as in eqn. (13.7). In addition, we 
note that the non-relativistic energy operator $i\hbar\partial_t$ is related to the relativistic
energy operator $i\hbar\tilde\partial_t$ by a shift with respect to the rest energy of particles:

$$ i\hbar\tilde\partial_t = mc^2 + i\hbar\partial_t. \tag{13.12} $$


This is because the non-relativistic Hamiltonian does not include the rest energy
of particles; its zero point is shifted so as to begin just above the rest energy.
Integrating the kinetic term by parts so that $(\partial\phi)^2 \to \phi(-\Box)\phi$ and substituting
eqn. (13.7) into eqn. (13.1) gives

$$ S = \int d\sigma\, dt\, \frac{1}{2} (\psi + \psi^*)_A \left[ \frac{\hbar^2 \partial_t^2}{2mc^2} - i\hbar\partial_t + \frac{E_A^2}{2mc^2} + E_A - \frac{\hbar^2}{2m} \nabla^2 \right] (\psi + \psi^*)_A. \tag{13.13} $$
2m 


If we use the fact that $\psi_A(x)$ is composed of only positive plane wave frequencies,
it follows that terms involving $\psi^2$ or $(\psi^*)^2$ vanish since they involve delta
functions imposing a non-satisfiable condition on the energy $\delta(mc^2 + \hbar\omega)$, where
both $m$ and $\omega$ are greater than zero. This assumption ceases to be true only if
there is an explicit time dependence in the action, indicating a non-equilibrium
scenario, or if the mass of the atoms goes to zero (in which case the NR limit is
unphysical). We are therefore left with


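$$ S_{NR} = \lim_{c\to\infty} \int d\sigma\, dt \left\{ \frac{i\hbar}{2} \left( \psi_A^* (\partial_t\psi_A) - (\partial_t\psi_A^*)\, \psi_A \right) - \psi_A^* H_A \psi_A \right\}, \tag{13.14} $$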


where the differential operator $H_A$ is defined by

$$ H_A = -\frac{\hbar^2 \nabla^2}{2m} + E_A + \frac{1}{2mc^2} \left( E_A^2 + \hbar^2 \partial_t^2 \right), \tag{13.15} $$


and we have re-defined the action by a sign in passing to a Euclideanized non-relativistic
metric. It is now clear that, in the NR limit $c \to \infty$, the final two
terms in $H_A$ become negligible, leading to the field equation

$$ H_A \psi_A(x) = i\hbar\partial_t \psi_A(x), \tag{13.16} $$


which is the Schrödinger equation of a particle of mass $m$ moving in a constant
potential of energy $E_A$ with a dipole interaction. The fact that it is possible to
identify what is manifestly the Hamiltonian $H$ in such an easy way is a special
property of theories which are linear in the time derivative. 

The direct use of the action (a non-physical quantity) in this way requires 
some care, so it is useful to confirm the above derivation with an approach 
based on the field equations, which are physical. As an additional spice, we 
also choose to scale the two components of the field by a factor involving the 
effective mass $m_A$, rather than the true atomic mass $m$. The two fields are then
scaled differently. This illustrates another viewpoint, namely of the particles 
as two species with a truly different mass, as would be natural in particle 
physics. We show that the resulting field equations have the same form in the 
non-relativistic limit, up to a shift in the arbitrary zero point energy. 

Starting from eqn. (13.3), we define new pseudo-canonical variables by 


WA i. 
Pa = > @ + én) 


1 R 
Qa = Ta g = ~én) (13.17) 


where $\hbar\omega_A \to m_A c^2$ in the non-relativistic limit, and the time dependence of
the fields is of the form of a plane wave $\exp(-i\omega_A t)$, for $\omega_A > 0$. This is the
same assumption that was made earlier. We note that, owing to this assumption,
the field $P_A(x)$ becomes large compared with $Q_A(x)$ in this limit. Substituting
this transformation into the field equation (13.3) and neglecting $Q_A$, one obtains


$$ i\hbar\, \partial_t P_A = \left( -\frac{\hbar^2\nabla^2}{2m_A} + m_A c^2 \right) P_A. \qquad (13.18) $$


These terms have a natural physical interpretation: the first term on the right 
hand side is the particle kinetic term for the excited and unexcited atoms in our 
system. The second term is the energy offset of the two levels in the atomic 
system. 

Our new point of view now leads to a free particle kinetic term with a mass
$m_A$, rather than the true atomic mass $m$. There is no contradiction here, since
$E_A$ is small compared to $mc^2$, so we can always expand the reciprocal mass to
first order. Expanding the reciprocal masses $m_A^{-1}$, we obtain


$$ m_A^{-1} = m^{-1} + O\!\left( \frac{E_A}{m^2c^2} \right), \qquad (13.19) $$
showing that a consistent NR limit requires us to drop the A-dependent pieces.
Eqn. (13.18) may then be compared with eqn. (13.16). It differs only by a
shift in the energy. A shift by the average energy level $\frac{1}{2}(E_1 + E_2)$ makes these
equations identical.
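
The expansion in eqn. (13.19) can be made explicit in a short worked step. The sketch below assumes, purely for illustration, that the effective mass is fixed by adding the level energy to the rest energy, $m_A c^2 = mc^2 + E_A$; this definition is an assumption made here and is not quoted from the text.

```latex
\begin{align}
m_A &= m + \frac{E_A}{c^2}, \\
m_A^{-1} &= \frac{1}{m}\left(1 + \frac{E_A}{mc^2}\right)^{-1}
          = \frac{1}{m} - \frac{E_A}{m^2c^2}
            + O\!\left(\frac{E_A^2}{m^3c^4}\right), \\
\frac{\hbar^2\nabla^2}{2m_A}
         &= \frac{\hbar^2\nabla^2}{2m}\left(1 - \frac{E_A}{mc^2} + \cdots\right).
\end{align}
```

Under this assumption the A-dependent correction to the kinetic term is suppressed by the ratio $E_A/mc^2$, which is why it may be dropped in the limit of large $c$.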
13.2.2 Non-relativistic limit of $G_F(x, x')$


As we have already indicated, the non-relativistic theory contains only positive
energy solutions. We also noted in section 5.5 that the Schrödinger Green
function $G_{NR}(x, x')$ satisfied purely retarded boundary conditions. There was
no Feynman Green function for the non-relativistic field. Formally, this is a
direct result of the lack of negative energy solutions to the Schrödinger equation
(or anti-particles, in the language of quantum field theory). We shall now show
that the object which we refer to as the Feynman Green function becomes the
non-relativistic retarded Green function in the limit $c \to \infty$. The same argument
applies to the relativistic retarded function, and it is clear from eqn. (5.74) that
the reason is the vanishing of the negative frequency Wightman function in the
non-relativistic limit.
We begin with eqn. (5.95) and reinstate $c$ and $\hbar$:


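$$ G_F(x, x') = c \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\, \frac{e^{ik\Delta x}}{2\hbar\omega_k} \left[ \frac{1}{c\hbar k_0 + \hbar\omega_k - i\epsilon} - \frac{1}{c\hbar k_0 - \hbar\omega_k + i\epsilon} \right]. \qquad (13.20) $$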


In order to compare the relativistic and non-relativistic Green functions, we have 
to re-scale the relativistic function by the rest energy, as in eqn. (13.7), since the 
two objects have different dimensions. Let 


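$$ 2mc^2\, G_F(x, x') \to G_{F,NR}(x, x'), \qquad (13.21) $$
so that the dimensions of $G_{F,NR}$ are the same as those for $G_{NR}$:
$$ \left( -\frac{\hbar^2\,\Box}{2m} + \frac{1}{2}mc^2 \right) G_{F,NR} = \delta(\mathbf{x}, \mathbf{x}')\,\delta(t, t') = c\,\delta(x, x'); $$
$$ \left( -\frac{\hbar^2\nabla^2}{2m} - i\hbar\,\partial_t \right) G_{NR} = \delta(\mathbf{x}, \mathbf{x}')\,\delta(t, t'). \qquad (13.22) $$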


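Next, we must express the relativistic energy $\hbar\omega$ in terms of the non-relativistic
energy $\hbar\tilde\omega$ and examine the definition of $\omega_k$ with $c$ reinstated,

$$ ck_0 = -\omega = -\left( \tilde\omega + \frac{mc^2}{\hbar} \right) $$
$$ \hbar\omega_k = \sqrt{\hbar^2c^2\mathbf{k}^2 + m^2c^4}. \qquad (13.23) $$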


The change of $k_0 \to -\tilde\omega/c$, both in the integral limits and the measure, means
that we effectively replace $dk_0 \to d\tilde\omega/c$. In the non-relativistic limit of large $c$,
the square-root in the preceding equation can be expanded using the binomial
theorem,
$$ \hbar\omega_k = mc^2 + \frac{\hbar^2\mathbf{k}^2}{2m} + O\!\left( \frac{1}{c^2} \right). \qquad (13.24) $$
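Substituting these results into eqn. (13.20), we have for the partial fractions

$$ \frac{1}{c\hbar k_0 + \hbar\omega_k - i\epsilon} = \frac{1}{\frac{\hbar^2\mathbf{k}^2}{2m} - \hbar\tilde\omega - i\epsilon} $$
$$ \frac{1}{c\hbar k_0 - \hbar\omega_k + i\epsilon} = \frac{1}{-\frac{\hbar^2\mathbf{k}^2}{2m} - \hbar\tilde\omega - 2mc^2 + i\epsilon}, \qquad (13.25) $$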


while the pre-factor becomes

$$ 2mc^2 \times \frac{1}{2\hbar\omega_k} = 1 - \frac{\hbar^2\mathbf{k}^2}{2m^2c^2} + O\!\left( \frac{1}{c^4} \right). \qquad (13.26) $$

Taking the limit $c \to \infty$ in these expressions causes the second partial fraction
in eqn. (13.25) to vanish. This is what removes the negative energy solutions
from the non-relativistic theory. The remainder may now be written as


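$$ G_{F,NR}(x, x') = \int \frac{d^nk}{(2\pi)^n}\, \frac{d\tilde\omega}{2\pi}\, \frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - \tilde\omega\Delta t)}}{\frac{\hbar^2\mathbf{k}^2}{2m} - \hbar\tilde\omega - i\epsilon}. \qquad (13.27) $$


We see that this is precisely the expression obtained in eqn. (5.140). It has poles
in the lower half-plane for positive frequencies. It is therefore a retarded Green
function and satisfies a Kramers–Kronig relation.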
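
The suppression of the negative energy pole can be checked numerically. The following is a minimal sketch, assuming units with $\hbar = 1$, dropping the $i\epsilon$ terms, and using illustrative values for $k$, $\tilde\omega$ and $m$; the function names are hypothetical and chosen only for this example.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1, for illustration only


def hbar_omega_k(k, m, c):
    """Relativistic energy sqrt(hbar^2 c^2 k^2 + m^2 c^4), as in eqn. (13.23)."""
    return math.sqrt((HBAR * c * k) ** 2 + (m * c ** 2) ** 2)


def nr_pieces(k, omega_tilde, m, c):
    """Return (prefactor, first fraction, second fraction).

    Uses the shift c*hbar*k0 = -(hbar*omega_tilde + m c^2); the i*epsilon
    prescriptions are omitted because only the magnitudes are compared.
    """
    e_k = hbar_omega_k(k, m, c)
    chk0 = -(HBAR * omega_tilde + m * c ** 2)
    prefactor = 2 * m * c ** 2 / (2 * e_k)
    first = 1.0 / (chk0 + e_k)    # survives the limit
    second = 1.0 / (chk0 - e_k)   # of order 1/(2 m c^2), vanishes as c grows
    return prefactor, first, second


def schrodinger_fraction(k, omega_tilde, m):
    """Non-relativistic denominator 1 / (hbar^2 k^2 / 2m - hbar*omega_tilde)."""
    return 1.0 / ((HBAR * k) ** 2 / (2 * m) - HBAR * omega_tilde)


for c in (10.0, 100.0, 1000.0):
    print(c, nr_pieces(k=1.0, omega_tilde=0.2, m=1.0, c=c))
print("NR value:", schrodinger_fraction(1.0, 0.2, 1.0))
```

As $c$ grows, the pre-factor tends to unity, the first fraction approaches the Schrödinger value and the second fraction falls off like $1/2mc^2$.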


13.3 Dirac field 


The non-relativistic limit of the Dirac equation is more subtle than that for scalar 
particles since the fields are spinors and the $\gamma$-matrices imply a constraint on
the components of the spinors. There are several derivations of this limit in 
the literature, all of them at the level of the field equations. Here we base our 
approach, as usual, on the action and avoid introducing specific solutions or 
making assumptions about their normalization. 


13.3.1 The free Dirac field 


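The Dirac action may be written
$$ S_D = \int (dx)\, \bar\psi \left( -\frac{1}{2} i\hbar c \left( \gamma^\mu \overrightarrow{\partial_\mu} - \gamma^\mu \overleftarrow{\partial_\mu} \right) + mc^2 \right) \psi. \qquad (13.28) $$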


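We begin by re-writing this in terms of the two-component spinors $\chi$ (see
chapter 20) and with non-symmetrical derivatives. The latter
choice is of no consequence and only aids notational simplicity:
$$ S_D = \int (dx)\, \psi^\dagger\gamma^0 \left( -i\hbar c\,\gamma^\mu\partial_\mu + mc^2 \right) \psi $$
$$ = \int (dx)\, \left( \chi_1^\dagger\ \ \chi_2^\dagger \right) \begin{pmatrix} -i\hbar\partial_t - mc^2 & -i\hbar c\,\sigma^i\partial_i \\ -i\hbar c\,\sigma^i\partial_i & -i\hbar\partial_t + mc^2 \end{pmatrix} \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}. \qquad (13.29) $$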


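This block matrix can be diagonalized by a unitary transformation. The
eigenvalue equation is

$$ (-i\hbar\partial_t - mc^2 - \lambda)(-i\hbar\partial_t + mc^2 - \lambda) + \hbar^2c^2\sigma^i\sigma^j\partial_i\partial_j = 0. \qquad (13.30) $$
Noting that
$$ \sigma^i\sigma^j\partial_i\partial_j = \partial^i\partial_i + i\epsilon^{ijk}\partial_i\partial_j\sigma_k, \qquad (13.31) $$

the eigenvalues may be written as

$$ \lambda_\pm = -i\hbar\partial_t \pm \sqrt{ m^2c^4 - \hbar^2c^2 \left( \partial^i\partial_i + i\epsilon^{ijk}\partial_i\partial_j\sigma_k \right) }. \qquad (13.32) $$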


Thus, the action takes on a block-diagonal form 


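$$ S_D = \int (dx)\, \bar\psi \left( -i\hbar c\,\gamma^\mu\partial_\mu + mc^2 \right) \psi $$
$$ = \int (dx)\, \left( \chi_1^\dagger\ \ \chi_2^\dagger \right) \begin{pmatrix} \lambda_+ & 0 \\ 0 & \lambda_- \end{pmatrix} \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}. \qquad (13.33) $$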


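In the non-relativistic limit, $c \to \infty$, we may expand the square-root in the
eigenvalues

$$ \lambda_\pm = -i\hbar\partial_t \pm mc^2 \left( 1 - \frac{\hbar^2 \left( \partial^i\partial_i + i\epsilon^{ijk}\partial_i\partial_j\sigma_k \right)}{2m^2c^2} + O(c^{-4}) + \cdots \right). \qquad (13.34) $$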


The final step is to re-define the energy operator by the rest energy of the field, 
for consistency with the non-relativistic definitions: 
$$ \lambda_\pm = -i\hbar\partial_t - mc^2 \pm mc^2 \left( 1 - \frac{\hbar^2\nabla^2}{2m^2c^2} + O(c^{-4}) + \cdots \right). \qquad (13.35) $$


Thus, in the limit, $c \to \infty$, the two eigenvalues, corresponding to positive and
negative energy, give

$$ \lambda_+ = -i\hbar\partial_t - \frac{\hbar^2\nabla^2}{2m} $$
$$ \lambda_- = \infty. \qquad (13.36) $$
Apart from an infinite contribution to the zero point energy which may be
re-defined (renormalized) away, and making an overall change of sign as in the
Klein–Gordon case, the non-relativistic action is

$$ S_D \to \int (dx)\, \chi^\dagger \left( i\hbar\partial_t + \frac{\hbar^2\nabla^2}{2m} \right) \chi. \qquad (13.37) $$
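
The Pauli matrix identity behind eqn. (13.31) is easy to confirm numerically. The sketch below uses the standard representation of the Pauli matrices (an assumption, since any equivalent representation would do) and also shows that, for commuting components $k_i$ such as those of a free plane wave, the antisymmetric $\epsilon^{ijk}$ term drops out and only $\mathbf{k}^2$ times the identity remains. The function names are illustrative only.

```python
# Pauli matrices sigma^1, sigma^2, sigma^3 as 2x2 nested tuples
# (standard representation, assumed here for illustration).
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
IDENTITY = ((1, 0), (0, 1))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[r][k] * b[k][s] for k in range(2)) for s in range(2))
        for r in range(2)
    )


def levi_civita(i, j, k):
    """epsilon^{ijk} for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def pauli_identity_holds():
    """Check sigma^i sigma^j = delta^{ij} + i epsilon^{ijk} sigma^k entrywise."""
    for i in range(3):
        for j in range(3):
            lhs = matmul(SIGMA[i], SIGMA[j])
            for r in range(2):
                for s in range(2):
                    rhs = (i == j) * IDENTITY[r][s] + 1j * sum(
                        levi_civita(i, j, k) * SIGMA[k][r][s] for k in range(3)
                    )
                    if abs(lhs[r][s] - rhs) > 1e-12:
                        return False
    return True


def sigma_sigma_kk(kvec):
    """sigma^i sigma^j k_i k_j for commuting components k_i."""
    total = [[0, 0], [0, 0]]
    for i in range(3):
        for j in range(3):
            prod = matmul(SIGMA[i], SIGMA[j])
            for r in range(2):
                for s in range(2):
                    total[r][s] += prod[r][s] * kvec[i] * kvec[j]
    return total


print(pauli_identity_holds())
print(sigma_sigma_kk((1.0, 2.0, 3.0)))
```

For the sample vector the result is proportional to the identity, which is the statement that only the $\nabla^2$ piece survives in eqn. (13.35) for the free field.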


13.3.2 The Dirac Green function 


The non-relativistic limit of the Dirac Green function may be inferred quite 
straightforwardly from the Green function for the scalar field. The Dirac Green 
function S(x, x’) satisfies the relation 


$$ \left( -i\hbar c\,\gamma^\mu\partial_\mu + mc^2 \right) S(x, x') = c\,\delta(x, x'). \qquad (13.38) $$


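We also know that the squared operator in this equation leads to a Klein–Gordon
operator, thus

$$ \left( i\hbar c\,\gamma^\mu\partial_\mu + mc^2 \right) G(x, x') = S(x, x'), \qquad (13.39) $$

so substituting this form into eqn. (13.38), where the conjugate operator now appears, leaves us with

$$ \left( -\hbar^2c^2\,\Box + m^2c^4 \right) G(x, x') = c\,\delta(x, x'). \qquad (13.40) $$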


Both sides of this equation are proportional to a spinor identity matrix, which 
therefore cancels, leaving a scalar equation. Since we know the limiting 
properties of G(x, x’) from section 13.2.2, we may take the limit by introducing 
unity in the form $2mc^2/2mc^2$, such that $2mc^2 G(x, x') = G_{NR}(x, x')$ and the
operator in front is divided by $2mc^2$. After re-defining the energy operator, as in
eqn. (13.12), the limit of $c \to \infty$ causes the quadratic time derivative to vanish,
leaving

$$ \left( -\frac{\hbar^2\nabla^2}{2m} - i\hbar\partial_t \right) G_{NR}(x, x') = \delta(\mathbf{x}, \mathbf{x}')\,\delta(t, t'). \qquad (13.41) $$


This is the scalar Schrödinger Green function relation. To get the Green 
function for the two-component spinors found in the preceding section, it may 
be multiplied by a two-component identity matrix. 
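
The steps between eqns. (13.38) and (13.41) can be written out as a short worked derivation. The sketch assumes the Clifford algebra convention $\{\gamma^\mu, \gamma^\nu\} = -2g^{\mu\nu}$ with signature $(-,+,+,+)$, and implements the re-definition of the energy operator by writing $G = e^{-imc^2t/\hbar}\tilde G$; both choices are stated here as assumptions rather than quoted from the text.

```latex
\begin{align}
(-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)(i\hbar c\,\gamma^\nu\partial_\nu + mc^2)
  &= \hbar^2c^2\,\tfrac{1}{2}\{\gamma^\mu,\gamma^\nu\}\,\partial_\mu\partial_\nu + m^2c^4
   = -\hbar^2c^2\,\Box + m^2c^4, \\
\frac{1}{2mc^2}\left(-\hbar^2c^2\,\Box + m^2c^4\right)
  &= \frac{\hbar^2\partial_t^2}{2mc^2} - \frac{\hbar^2\nabla^2}{2m} + \frac{1}{2}mc^2, \\
\frac{\hbar^2\partial_t^2}{2mc^2}\; e^{-imc^2t/\hbar}\,\tilde G
  &= e^{-imc^2t/\hbar}\left(-\frac{1}{2}mc^2 - i\hbar\partial_t
     + \frac{\hbar^2\partial_t^2}{2mc^2}\right)\tilde G .
\end{align}
```

The rest energy terms $\pm\frac{1}{2}mc^2$ cancel, and the remaining $\hbar^2\partial_t^2/2mc^2$ term vanishes as $c \to \infty$, leaving the operator $-\hbar^2\nabla^2/2m - i\hbar\partial_t$.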


13.3.3 Spinor electrodynamics 


The interaction between electrons and radiation complicates the simple procedure
outlined in the previous section. The minimal coupling to radiation via the
gauge potential $A_\mu(x)$ involves $x$-dependence, which means that the derivatives
do not automatically commute with the diagonalization procedure. We must
therefore modify the discussion to account for this, in particular taking more
care with time reversal invariance. In addition, we must consider the reaction
of the electronic matter to the presence of an electromagnetic field. This leads
to a polarization of the field, or effective refractive index (see section 21.2 for a
simple discussion of classical polarization). The action for electrodynamics is
thus


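$$ S_{QED} = \int (dx) \left\{ \bar\psi \left( -\frac{1}{2} i\hbar c \left( \gamma^\mu \overrightarrow{D_\mu} - \gamma^\mu \overleftarrow{D_\mu} \right) + mc^2 \right) \psi + \frac{1}{4\mu_0} F^{\mu\nu} G_{\mu\nu} \right\}, \qquad (13.42) $$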


where $G_{\mu\nu}$ is the covariant displacement field, defined in eqn. (21.62). We
proceed once again by re-writing this in terms of the two-component spinors
$\chi$. We consider the matter and radiation terms separately. The matter action is
given by


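$$ S_M = \int (dx)\, \bar\psi \left( -i\hbar c\,\gamma^\mu D_\mu + mc^2 \right) \psi $$
$$ = \int (dx)\, \left( \chi_1^\dagger\ \ \chi_2^\dagger \right) \begin{pmatrix} -\frac{i\hbar}{2}\overleftrightarrow{D_t} - mc^2 & -i\hbar c\,\sigma^i D_i \\ -i\hbar c\,\sigma^i D_i & -\frac{i\hbar}{2}\overleftrightarrow{D_t} + mc^2 \end{pmatrix} \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}. \qquad (13.43) $$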


In electrodynamics, the covariant derivative is $D_\mu = \partial_\mu + i\frac{e}{\hbar}A_\mu$, from which it
follows that

$$ [D_\mu, D_\nu] = i\frac{e}{\hbar}F_{\mu\nu}. \qquad (13.44) $$


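The block matrix in eqn. (13.43) can be diagonalized by a unitary transformation.
The symmetrized eigenvalue equation is

$$ \left( -\frac{i\hbar}{2}\overleftrightarrow{D_t} - mc^2 - \lambda \right) \left( -\frac{i\hbar}{2}\overleftrightarrow{D_t} + mc^2 - \lambda \right) + \hbar^2c^2\sigma^i\sigma^j D_iD_j = 0, \qquad (13.45) $$
or
$$ \lambda^2 + 2i\hbar\lambda D_t + \hbar^2c^2\sigma^i\sigma^j D_iD_j - \hbar^2D_t^2 - m^2c^4 - \frac{i\hbar}{2}(\overleftrightarrow{\partial_t}\lambda) = 0, \qquad (13.46) $$

where the last term arises from the fact that the eigenvalues themselves depend
on $x$ due to the gauge field. It is important that this eigenvalue equation be
time-symmetrical, as indicated by the arrows. We may write this in the form

$$ \lambda = -\frac{i\hbar}{2}\overleftrightarrow{D_t} \pm \sqrt{ m^2c^4 - \hbar^2c^2\sigma^i\sigma^j D_iD_j + \frac{i\hbar}{2}(\overleftrightarrow{\partial_t}\lambda) } \qquad (13.47) $$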
and we now have an implicit equation for the positive and negative energy roots
of the operator $\lambda$. The fact that the derivative term $\partial_t\lambda$ is a factor of $c^2$ smaller
than the other terms in the square-root means that this contribution will always
be smaller than the others. In the strict non-relativistic limit $c \to \infty$ it is
completely negligible. Since the square-root contains operators, we represent
it by its binomial expansion

$$ (1 + x)^n = 1 + nx + \frac{n(n-1)}{2}x^2 + \ldots, \qquad (13.48) $$


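after extracting an overall factor of $mc^2$, thus:

$$ \lambda = -\frac{i\hbar}{2}\overleftrightarrow{D_t} \pm \left[ mc^2 - \frac{\hbar^2\sigma^i\sigma^j D_iD_j}{2m} - \frac{\hbar^4(\sigma^i\sigma^j D_iD_j)^2}{8m^3c^2} + \frac{i\hbar}{4mc^2}(\overleftrightarrow{\partial_t}\lambda) + \cdots \right]. \qquad (13.49) $$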
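
The numerical coefficients in the expansion of the square-root follow from eqn. (13.48) with $n = \frac{1}{2}$. The sketch below generates them exactly with rational arithmetic and compares the truncated series with the true square-root for an ordinary number $x$; in the text $x$ is an operator, so this only illustrates the coefficients $\frac{1}{2}$ and $-\frac{1}{8}$ which, after multiplication by $mc^2$, give the denominators $2m$ and $8m^3c^2$. The function names are illustrative only.

```python
import math
from fractions import Fraction


def binomial_coefficients(n, order):
    """Coefficients of x^0 .. x^order in (1 + x)^n: n(n-1)...(n-r+1)/r!."""
    coeffs = [Fraction(1)]
    for r in range(1, order + 1):
        coeffs.append(coeffs[-1] * (n - (r - 1)) / r)
    return coeffs


def sqrt_series(x, order=2):
    """Truncated binomial series for sqrt(1 + x), valid for small |x|."""
    coeffs = binomial_coefficients(Fraction(1, 2), order)
    return sum(float(c) * x ** r for r, c in enumerate(coeffs))


print(binomial_coefficients(Fraction(1, 2), 3))
print(sqrt_series(0.01), math.sqrt(1.01))
```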


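The final term, $\partial_t\lambda$, can be evaluated to first order by iterating this expression.
Symmetrizing over time derivatives, the first order derivative of eqn. (13.49) is
$$ (\overleftrightarrow{\partial_t}\lambda) = \mp\frac{\hbar^2}{2m}\sigma^i\sigma^j \left( D_i \overleftrightarrow{\partial_t} D_j \right) $$
$$ = \mp\frac{ie\hbar}{2m}\sigma^i\sigma^j \left( D_iE_j - E_iD_j \right) \qquad (13.50) $$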


since we may add and subtract $\partial_i A_t$ with impunity. To go to next order, we must
substitute this result back into eqn. (13.49) and take the time derivative again. 
This gives a further correction 


< © h ih (ieh , , 
(aA) = Fi a oio’ (DiE; — E;Dj) (13.51) 


4mc2 ‘| 4mc2 \ 2m 


Noting the energy shift $-i\hbar\partial_t \to -i\hbar\partial_t - mc^2$ and taking the positive
square-root, we obtain the non-relativistic limit for the positive half of the solutions:
fh D'D; — ehB'o; 

2m 2m 


Sp > feol (ind, + 


2 


ar amin? y | (Di Ej — E; Dj) 


ht ; e 2 
gme (wn) = ZiB) 


l gig) 3, DD BD )x l (13.52) 


3 


EF 3¢4 
where $B_i = \frac{1}{2}\epsilon_{ijk}F_{jk}$ and the overall sign has been changed to conform to the
usual conventions. The negative root in eqn. (13.47) gives a similar result for
anti-particles. The fifth term contains $\partial^i E_i$, which is called the Darwin term.
It corresponds to a correction to the point charge interaction due to the fact
that Dirac ‘particles’ are spread out over a region with radius of the order
of the Compton wavelength $\hbar/mc$. In older texts, this is referred to as the
Zitterbewegung, since, if one insists on a particle interpretation, it is necessary to
imagine the particles jittering around their average position in a kind of random
walk. The $\sigma^i B_i$ term is a Zeeman splitting term due to the interaction of the
magnetic field with particle trajectories.

Note that our diagonalization of the Dirac action leads to no coupling between
the positive and negative energy solutions. One might expect that interactions
with $A_\mu$, which couple indiscriminately with both positive and negative energy
parts of the field, would lead to an implicit coupling between positive and
negative energy parts. This is not the case classically, however, since the vector
potential $A_\mu$ leads to no non-linearities with respect to $\psi$.

Radiative corrections (fluctuation corrections) in the relativistic fields give
rise to back-reaction terms both in the fermion sector and in the electromagnetic
sector. The effect of photon $D_{\mu\nu}$ exchange leads to an effective quartic
interaction

$$ S_I = \int (dx)(dx')\, \left( \bar\psi(x')\gamma^\mu\psi(x') \right) D_{\mu\nu}(x, x') \left( \bar\psi(x)\gamma^\nu\psi(x) \right). \qquad (13.53) $$


The photon propagator is clearly a non-local and gauge-dependent quantity. 
Non-locality is a feature of the full theory, and reflects the fact that the finite 
speed of light disallows an instantaneous response in the field during collisions 
(there is an intrinsic non-elasticity in relativistic particle scattering). Working to 
a limited order in $1/c$ makes the effective Lagrangian effectively local, however,
since the non-local derivative expansion is truncated. The gauge dependence of 
the Lagrangian is more subtle. In order to obtain a physically meaningful result, 
one requires an effective Lagrangian which produces gauge-fixing independent 
results. This does not necessarily mean that the Lagrangian needs to be 
gauge-independent, however. The reason is that the Lagrangian is no longer 
covariant with respect to the necessary symmetries to make this apparent. 
Gauge invariance is related to a conformal/Lorentz symmetry of the relativis- 
tic gauge field, so one would expect a loss of Lorentz invariance to result in 
a breakdown of invariance under choice of gauge-fixing condition. In fact, a 
non-relativistic effective Lagrangian is not unique: its form is indeed gauge- 
dependent. Physical results cannot be gauge-dependent, however, provided one 
works to consistent order in the expansion of the original covariant theory. Thus, 
the gauge condition independence of the theory will be secured by working 
to consistent order in the smallness parameters, regardless of the actual gauge 
chosen. Propagators and Lagrangian are gauge-dependent, but in just the right
way to provide total gauge independence. 

Turning to the photon sector, we seek to account for the effects of vacuum and 
medium polarization in leading order kinetic terms for the photon. To obtain the 
non-relativistic limit of the radiation terms, it is advantageous to have a model of 
the dielectric medium in which photons propagate. Nevertheless, some progress 
can be made on the basis of a generic linear response. We therefore use linear 
response theory and assume a constitutive relation for the polarization of the 
form 


$$ G_{\mu\nu}(x) = F_{\mu\nu}(x) + \int (dx')\, \chi(x, x')\, F_{\mu\nu}(x'). \qquad (13.54) $$


The second term is a correction to the local field, which is proportional to the
field itself. Perhaps surprisingly, this relation plays a role even in the vacuum,
since quantum field theory predicts that the field $\psi$ may be polarized by the
back-reaction of field fluctuations in $A_\mu$. Since the susceptibility $\chi(x, x')$
depends on the dynamics of the non-relativistic matter field, one expects this
polarization to break the Lorentz invariance of the radiation term. This occurs
because, at non-relativistic speeds, the interaction between matter and radiation
splits into electric and magnetic parts which behave quite differently. From
classical polarization theory, we find that the momentum space expression for
the susceptibility takes the general form
22 
One ees (13.55) 
— iyo +o? 

In an electron plasma, where there are no atoms which introduce interactions
over and above the ones we are considering above, the natural frequency
of oscillations can only be $\omega_0 \sim mc^2/\hbar$. These are the only scales from
which to construct a frequency. The significance of this value arises from the
correlations of the fields on the order of the Compton wavelength which lead
to an elastic property of the field. This is related to the Casimir effect and to
the Zitterbewegung mentioned earlier. It is sufficient to note that such a system
has an ultra-violet resonance, where $\omega_0 \gg \omega$ in the non-relativistic limit. This
means that $\chi(\omega)$ can be expanded in powers of $\omega/\omega_0$. From the equations of
motion, $\hbar\omega \sim \hbar^2k^2/2m$; thus, the expansion is in powers of the quantity


$$ \frac{\omega}{\omega_0} = \frac{\hbar^2k^2/2m}{\hbar(mc^2/\hbar)} \sim \frac{\hbar^2k^2}{m^2c^2}. $$


It follows that the action for the radiation may be written in the generic form 


ws 2 
w= feof alee) 


2 
4 oe z a| Vey taap (Ert) art. (13.57) 


(13.56) 
This form expresses only the symmetries of the field and dimensional scales of
the system. In order to evaluate the constants $C_E$ and $C_B$ in this expression, it
would be necessary to be much more specific about the nature of the polarization.
For a plasma in a vacuum, the constants are equal to unity in the classical 
approximation. The same form for the action would apply in the case of 
electrons in an ambient polarizable medium, below resonance. Again, to 
determine the constants in that case, one would have to introduce a model for 
the ambient matter or input effective values for the constants by hand. 


13.4 Thermal and Euclidean Green functions 


There are two common formulations of thermal Green functions. At thermal 
equilibrium, where time is an irrelevant variable on average, one can rotate to a 
Euclidean, imaginary time formulation, as in eqn. (6.46), where the imaginary
part of time plays the role of an inverse temperature $\beta$. Alternatively one can
use a real-time formulation as in eqn. (6.61). 

The non-relativistic limit of Euclideanized field theory is essentially no 
different from the limit in Minkowski spacetime, except that there is no direct 
concept of retarded or advanced boundary conditions in terms of poles in the 
propagator. There is nevertheless still a duplicity in the solutions with positive 
and negative, imaginary energy. This duplicity disappears in the non-relativistic 
limit, as before, since half of the spectrum is suppressed. The relativistic, 
Euclidean Green function, closely related to the zero-temperature Feynman 
Green function, is given by 


$$ G_\beta(x, x') = \int \frac{d\omega}{2\pi}\, \frac{d^nk}{(2\pi)^n}\, \frac{e^{ik(x - x')}}{p^2c^2 + m^2c^4}, \qquad (13.58) $$


where the zeroth component of the momentum is given by the Matsubara
frequencies $p_0 = 2n\pi/\beta\hbar c$:

$$ 2mc^2\, G_\beta(x, x') = \int \frac{d\omega}{2\pi}\, \frac{d^nk}{(2\pi)^n}\, \frac{e^{ik(x - x')}}{\frac{p^2}{2m} + \frac{1}{2}mc^2}. \qquad (13.59) $$


Shifting the energy $i p_0 \to mc + i p_0$ leaves us with


Gk N J dw d'k  eikG-2) (13.60) 
X,x)= : j 
BR O ET 
which is the Green function for the Euclidean action 
$$ S = \int (dx)\, \chi^\dagger \left[ -\frac{\hbar^2\nabla^2}{2m} + \hbar\,\partial_\tau \right] \chi. \qquad (13.61) $$
In the real-time formulation, in which we retain the auxiliary time dependence,
the thermal character of the Green functions is secured through the momentum
space boundary condition in eqn. (6.54), known in quantum theory as the
Kubo–Martin–Schwinger relation. Considering the boundary terms in eqn. (6.61) and
following the procedure in section 13.2.2, one has


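$$ 2mc^2\, 2\pi i\, f(k)\,\theta(k_0)\,\delta(p^2c^2 + m^2c^4) \to 2mc^2\, 2\pi i\, f(k)\,\theta(k_0)\, \frac{\delta(p_0c - \hbar\omega_k)}{2\hbar\omega_k}. \qquad (13.62) $$

In the large $c$ limit, $\hbar\omega_k \to mc^2$, thus the $c \to \infty$ limit of this term is simply
$$ 2\pi i\, f(\tilde\omega_k), \qquad (13.63) $$

where $\hbar\omega_k = mc^2 + \hbar\tilde\omega_k$.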

Intimately connected to this form is the Kubo–Martin–Schwinger (KMS)
relation. We looked at this relation in section 6.1.5, and used it to derive the form
of the relativistic Green functions. Notice that the zero-temperature, negative
frequency parts of the Wightman functions do not contribute to the derivation
of this relation in eqn. (6.56). For this reason, the form of the relationship in
eqn. (6.54) is unchanged,

$$ -G^{(+)}(\tilde\omega) = e^{\hbar\beta\tilde\omega}\, G^{(-)}(\tilde\omega). \qquad (13.64) $$


This use of the non-relativistic energy in both the relativistic and non-relativistic 
cases is important and leads to a subtlety in the Euclidean formulation. From 
the simplistic viewpoint of a Euclidean imaginary-time theory, the meaning of a 
thermal distribution is different in the relativistic and non-relativistic cases. The 
Boltzmann factor changes from 


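$$ e^{-\beta(\hbar\tilde\omega + mc^2)} \to e^{-\beta\hbar\tilde\omega}. \qquad (13.65) $$
This change is reflected also in a change in the time dependence of wave modes,

$$ e^{-i(\tilde\omega + mc^2/\hbar)t} \to e^{-i\tilde\omega t}. \qquad (13.66) $$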
The shift is necessary to reflect the change in dynamical constraints posed 
by the equations of motion. However, the Boltzmann condition applies (by 
convention) to the non-relativistic energy. It is this energy scale which defines 
the temperature we know. 

Another way of looking at the change in the Boltzmann distribution is from 
the viewpoint of fluctuations. Thermal fluctuations give rise to the Boltzmann 
factor, and these must have a special causal symmetry: emission followed 
by absorption. These processes are mediated by the Green functions, which 
reflect the equations of motion and are therefore unambiguously defined. As we 
take the non-relativistic limit, the meaning of the thermal distribution changes
character subtly. The positive frequency/energy condition changes from being
$\theta(\omega) = \theta(\tilde\omega + mc^2/\hbar)$ to $\theta(\tilde\omega)$ owing to the re-definition of the zero point energy.
Looking back at eqn. (6.56), we derived the Bose-Einstein distribution using the
fact that

$$ \theta(-\omega)\, e^{\beta\omega} = 0. \qquad (13.67) $$
But one could equally choose the zero point energy elsewhere and write
$$ \theta(-(\omega + \Delta\omega))\, e^{\beta(\omega + \Delta\omega)} = 0. \qquad (13.68) $$


As long as the Green functions are free of interactions which couple the energy 
scale to a third party, we can re-label the energy freely by shifting the variable 
of integration in eqn. (5.64). In an interacting theory, the meaning of such a 
re-labelling is less clear. 

In a system which is already in thermal equilibrium, one might argue that the 
interactions are not relevant. Interactions are only important in the approach to 
equilibrium and to the final temperature. With a new definition of the energy, a
temperature has the same role as before, but the temperature scale $\beta'$ is modified.

This might seem slightly paradoxical, but the meaning is clear. The KMS
condition expressed by eqn. (6.54) simply indicates that the fluctuations medi- 
ated by given Green functions should be in thermal balance. The same condition 
may be applied to any virtual process, based on any equilibrium value or zero 
point energy. If we change the Green functions, we change the condition and the 
physics underpinning it. In each case, one obtains an equilibrium distribution 
of the same general form, but the meaning depends on the original Green 
functions. In order to end up with equivalent temperature scales, one must use 
equivalent energy scales. Relativistic energies and non-relativistic energies are 
not equivalent, and neither are the thermal distributions obtained from these. 
In the non-relativistic case, thermal fluctuations comprise kinetic fluctuations in 
particle motion. In the relativistic case, the energy of the particles themselves is 
included. 

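Two thermal distributions

$$ e^{\hbar\beta\omega} = e^{\hbar(\beta + \Delta\beta)(\omega + \Delta\omega)} \qquad (13.69) $$
are equivalent if
$$ \frac{\beta + \Delta\beta}{\beta} = \frac{\omega}{\omega + \Delta\omega}. \qquad (13.70) $$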

These two viewpoints are related by a renormalization of the energy or chemical 
potential; the reason why such a renormalization is required is precisely because 
of the change in energy conventions which affects the Euclidean formulation. 
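
The two statements above, that a common rest energy cancels from relative Boltzmann weights and that a shifted energy scale can be compensated by a rescaled $\beta$, can be illustrated with a few lines of arithmetic. This is a sketch in units with $\hbar = 1$ (an assumption), with hypothetical function names and sample values.

```python
import math


def boltzmann(beta, energy):
    """Boltzmann factor exp(-beta * energy)."""
    return math.exp(-beta * energy)


def equivalent_beta(beta, omega, delta_omega):
    """Return beta + delta_beta such that
    (beta + delta_beta) * (omega + delta_omega) == beta * omega,
    which is the content of eqn. (13.70)."""
    return beta * omega / (omega + delta_omega)


def level_ratio(beta, e1, e2, rest_energy=0.0):
    """Relative weight of two levels; a common rest energy cancels."""
    return boltzmann(beta, e2 + rest_energy) / boltzmann(beta, e1 + rest_energy)


beta, omega, shift = 2.0, 3.0, 1.0
beta_new = equivalent_beta(beta, omega, shift)
print(beta_new * (omega + shift), beta * omega)       # equal exponents
print(level_ratio(1.0, 0.5, 1.5), level_ratio(1.0, 0.5, 1.5, rest_energy=10.0))
```

The rest energy only contributes an overall factor $e^{-\beta mc^2}$, which is why it can be absorbed into a renormalization of the energy or chemical potential.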
13.5 Energy conservation


The speed of light is built into the covariant notation of the conservation law

$$ \partial_\mu \theta^{\mu\nu} = 0. \qquad (13.71) $$


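We must therefore ascertain whether the conservation law is altered by the limit
$c \to \infty$. From eqns. (11.5) and (11.44), one may write

$$ \partial_\mu \theta^{\mu\nu} = \frac{1}{c}\partial_t \theta^{0\nu} + \partial_i \theta^{i\nu} $$
$$ = \partial_0 \theta^{0\nu} + \partial_i \theta^{i\nu}. \qquad (13.72) $$
It is apparent from eqn. (11.5) that, as $c \to \infty$,

$$ \theta_{0i} \to \infty $$
$$ \theta_{i0} \to 0. \qquad (13.73) $$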


Splitting $\nu$ into space and time components, we have, for the time component,


1 
3 0”? = -3,0 
C 


n 


[3 git 4 30] 


7 [39 H] = 0. (13.74) 


Because of the limit, this equation is ambiguous, but the result is sensible if we 
interpret the contents of the brackets as being zero. For the space components 
one has 


$$ \partial_0 \theta^{0i} + \partial_j \theta^{ji} = 0 $$
$$ \partial_t p^i + \partial_j \sigma^{ji} = 0, \qquad (13.75) $$


where $\sigma_{ij}$ is the stress tensor. Thus, energy conservation is preserved but it
becomes divided into two separate statements, one about the time independence 
of the total Hamiltonian, and another expressing Newton’s law that the rate of 
change of momentum is equal to the applied force. 


13.6 Residual curvature and constraints 


The non-relativistic limit does not always commute with the limit of zero 
curvature, nor with that of dimensional reduction, such as projection in order 
to determine the effective dynamics on a lower-dimensional constraint surface 
[15]. Such a reduction is performed by derivative expansion, in which every 
derivative seeks out orders of the curvature of the embedded surface. Since the 
non-relativistic limit is also an expansion in terms of small derivatives, there is
an obvious connection between these. In particular, the shape of a constraint
surface can have specific implications for the consistency of the non-relativistic
limit [29, 30, 80, 94, 95]. Caution should always be exercised in taking limits,
to avoid premature loss of information.


14 


Unified kinematics and dynamics 


This chapter sews together some of the threads from the earlier chapters to 
show the relationships between apparently disparate dynamical descriptions of 
physics. 


14.1 Classical Hamiltonian particle dynamics 


The traditional formulation of Schrödinger quantum mechanics builds on a
Hamiltonian formulation of dynamical systems, in which the dynamics describe not
only particle coordinates q but also their momenta p. The interesting feature 
of the Hamiltonian formulation, in classical mechanics, is that one deals only 
with quantities which have a direct physical interpretation. The disadvantage 
with the Hamiltonian approach in field theory is its lack of manifest covariance 
under Lorentz transformations: time is singled out explicitly in the formulation.¹
Some important features of the Hamiltonian formulation are summarized here 
in order to provide an alternative view to dynamics with some different insights. 

The Hamiltonian formulation begins with the definition of the momentum $p_i$
conjugate to the particle coordinate $q_i$. This quantity is introduced since it is
expected to have a particular physical importance. Ironically, one begins with
the Lagrangian, which is unphysical and is to be eliminated from the discussion.
The Lagrangian is generally considered to be a function of the particle
coordinates $q_i$ and their time derivatives or velocities $\dot q_i$. The momentum is then
conveniently defined from the Lagrangian,

$$ p_i = \frac{\partial L}{\partial \dot q_i}. \qquad (14.1) $$


¹ Actually, time is singled out in a special way even in the fully covariant Lagrangian
formulation, since time plays a fundamentally different role from space as far as the dynamics 
are concerned. The main objection which is often raised against the Hamiltonian formulation 
is the fact that the derivation of covariant results is somewhat clumsy. 
This is not the only way in which one could define a momentum, but it is
convenient to use a definition which refers only to the abstract quantities $L$ and
$q_i$ in cases where the Lagrangian and its basic variables are known, but other
physical quantities are harder to identify. This extends the use of the formalism
to encompass objects which one would not normally think of as positions and
momenta. The total time derivative of the Lagrangian is

$$ \frac{dL}{dt} = \frac{\partial L}{\partial q_i}\dot q_i + \frac{\partial L}{\partial \dot q_i}\ddot q_i + \frac{\partial L}{\partial t}, \qquad (14.2) $$
which, on using the equations of motion, may be written
$$ \frac{d}{dt}\left\{ \dot q_i \frac{\partial L}{\partial \dot q_i} - L \right\} = -\frac{\partial L}{\partial t}. \qquad (14.3) $$
Now, if the Lagrangian is not explicitly time-dependent, $\partial L/\partial t = 0$, then the
quantity in the curly braces must be constant with respect to time, so, using
eqn. (14.1), we may define the Hamiltonian $H$ by

$$ H = \text{const.} = p\dot q - L. \qquad (14.4) $$


Notice that this definition involves time derivatives. When we consider the 
relativistic case, timelike quantities are often accompanied by a sign from the 
metric tensor, so the form of the Hamiltonian above should not be treated as 
sacred. 
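
As a concrete instance of eqns. (14.1) and (14.4), consider a single particle in a potential. The Lagrangian chosen below is an illustrative assumption, not an example taken from the text.

```latex
\begin{align}
L &= \tfrac{1}{2} m \dot q^2 - V(q), \\
p &= \frac{\partial L}{\partial \dot q} = m \dot q, \\
H &= p\dot q - L = \frac{p^2}{2m} + V(q), \\
\frac{dH}{dt} &= \dot q \left( m\ddot q + \frac{\partial V}{\partial q} \right) = 0 .
\end{align}
```

The last line vanishes by the equation of motion $m\ddot q = -\partial V/\partial q$, so in this case $H$ is the conserved total energy.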


14.1.1 Hamilton’s equations of motion 


The equations of motion in terms of the new variables may be obtained in the
usual way from the action principle, but now treating $q_i$ and $p_i$ as independent
variables. Writing the Lagrangian as $L = p\dot q - H$, in accordance with eqn. (14.4),
gives us the action

$$ S = \int dt\, \{ p\dot q - H \}. \qquad (14.5) $$

However, from earlier discussions about symmetrical derivatives, we know that
the correct action is symmetrized about the derivatives. Thus, the action is given
by

$$ S = \int dt \left\{ \frac{1}{2}(p\dot q - q\dot p) - H \right\}. \qquad (14.6) $$


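Varying this action with fixed end-points, one obtains (integrating the $p\dot q$ term
by parts)

$$ \frac{\delta S}{\delta q(t)} = -\dot p - \frac{\partial H}{\partial q} = 0 $$
$$ \frac{\delta S}{\delta p(t)} = \dot q - \frac{\partial H}{\partial p} = 0. \qquad (14.7) $$
Hence, Hamilton’s two equations of motion result:

$$ \dot p = -\frac{\partial H}{\partial q} \qquad (14.8) $$
$$ \dot q = \frac{\partial H}{\partial p}. \qquad (14.9) $$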

Notice that this is a pair of equations. This is a result of our insistence on 
introducing an extra variable (the momentum) into the formulation. 
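
Hamilton's pair of first-order equations is convenient for numerical work. The following minimal sketch integrates them for a harmonic oscillator, $H = p^2/2m + \frac{1}{2}kq^2$, using a leapfrog (kick-drift-kick) step; the choice of Hamiltonian, integrator, step size and parameter values are all assumptions made for illustration.

```python
def hamiltonian(q, p, m=1.0, k=1.0):
    """H = p^2 / 2m + k q^2 / 2 for a harmonic oscillator."""
    return p * p / (2 * m) + 0.5 * k * q * q


def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate p_dot = -dH/dq = -k q and q_dot = dH/dp = p / m."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q   # half step in momentum
        q += dt * p / m         # full step in position
        p -= 0.5 * dt * k * q   # second half step in momentum
    return q, p


q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=0.01, steps=628)  # roughly one period, 2*pi
print(q1, p1)
print(hamiltonian(q0, p0), hamiltonian(q1, p1))
```

After approximately one period the trajectory returns close to its starting point, and the value of $H$ is constant up to the small discretization error of the integrator.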


14.1.2 Symmetry and conservation 


One nice feature of the Hamiltonian formulation is that invariances of the 
equations of motion are all identifiable as a generalized translational invariance. 
If the action is independent of a given coordinate $q_n$,

$$ \frac{\partial L}{\partial q_n} = 0, \qquad (14.10) $$
then
$$ \frac{\partial L}{\partial \dot q_n} = p_n = \text{const.}; \qquad (14.11) $$

i.e. the momentum associated with that coordinate is constant, or is conserved.
The coordinate $q_n$ is then called an ignorable coordinate.


14.1.3 Symplectic transformations 


We started originally with an action principle, which treated only $q(t)$ as a
dynamical variable, and later introduced (artificially) the independent momentum
variable $p$.² The fact that we now have twice the number of dynamical variables
seems unnecessary. This intuition is further borne out by the observation that, if
we make the substitution

$$ q \to -p \qquad (14.12) $$
$$ p \to q \qquad (14.13) $$


in eqn. (14.9), then we end up with an identical set of equations, with only 
the roles of the two equations switched. This transformation represents an 


² In many textbooks, the Lagrangian formulation is presented as a function of coordinates $q$ and
velocities $\dot q$. Here we have bypassed this discussion by working directly with variations of the
action, where it is possible to integrate by parts and perform functional variations. This makes 
the usual classical Lagrangian formalism redundant. 
invariance of the Hamilton equations of motion, and hints that the positions
and momenta are really just two sides of the same coin. 

Based on the above, one is motivated to look for a more general linear 
transformation in which p and q are interchanged. In doing so, one must be 
a little cautious, since positions and momenta clearly have different engineering 
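dimensions. Let us therefore introduce the quantities $\hat p$ and $\hat q$, which are
re-scaled by a constant $\Omega$ with dimensions of mass per unit time in order that
they have the same dimensions:

$$ \hat p = p/\sqrt{\Omega} $$
$$ \hat q = q\sqrt{\Omega}. \qquad (14.14) $$

The product of $\hat q$ and $\hat p$ is independent of this scale, and this implies that the
form of the equations of motion is unchanged:

$$ \dot{\hat p} = -\frac{\partial H}{\partial \hat q} \qquad (14.15a) $$
$$ \dot{\hat q} = \frac{\partial H}{\partial \hat p}. \qquad (14.15b) $$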


Let us consider, then, general linear combinations of q and p and look for all 
those combinations which leave the equations of motion invariant. In matrix 
form, we may write such a transformation as 


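$$ \begin{pmatrix} \hat q' \\ \hat p' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \hat q \\ \hat p \end{pmatrix}. \qquad (14.16) $$
The derivatives associated with the new coordinates are
$$ \frac{\partial}{\partial \hat q'} = \frac{1}{2}\left( \frac{1}{a}\frac{\partial}{\partial \hat q} + \frac{1}{b}\frac{\partial}{\partial \hat p} \right) $$
$$ \frac{\partial}{\partial \hat p'} = \frac{1}{2}\left( \frac{1}{c}\frac{\partial}{\partial \hat q} + \frac{1}{d}\frac{\partial}{\partial \hat p} \right). \qquad (14.17) $$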


We may now substitute these transformed coordinates into the Hamilton equations
of motion (14.15) and determine the values of $a, b, c, d$ for which the
equations of motion are preserved. From eqn. (14.15b), one obtains

$$ a\dot{\hat q} + b\dot{\hat p} = \frac{1}{2}\left( \frac{1}{c}\frac{\partial H}{\partial \hat q} + \frac{1}{d}\frac{\partial H}{\partial \hat p} \right). \qquad (14.18) $$

This equation is a linear combination of the original equations in (14.15) provided
that we identify

$$ 2ad = 1 $$
$$ 2bc = -1. \qquad (14.19) $$
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Substitution into eqn. (14.15a) confirms this. The determinant of the transfor- 
mation matrix is therefore 


ad —bc=1 (14.20) 


and we may write, in full generality: 


$$ U = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} e^{i\theta} & i\,e^{i\phi} \\ i\,e^{-i\phi} & e^{-i\theta} \end{pmatrix}. \tag{14.21} $$

This is the most general transformation of $\hat{p}$, $\hat{q}$ pairs which leaves the equations 
of motion invariant. The set of transformations which leave the Poisson 
bracket invariant forms a group known as the symplectic group sp(2, C). If 
we generalize the above discussion by adding indices to the coordinates and 
momenta, $i = 1, \ldots, n$, then the group becomes sp(2n, C). 

Since we have now shown that p and q play virtually identical roles in the 
dynamical equations, there is no longer any real need to distinguish them with 
separate symbols. In symplectic notation, many authors write both coordinates 
and momenta as $Q_i$, where $i = 1, \ldots, 2n$, grouping both together as generalized 
coordinates. 
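
The conditions in eqns. (14.19) and (14.20) are easy to confirm numerically. The following Python sketch assumes that the matrix in eqn. (14.21) has the entries $a = e^{i\theta}/\sqrt{2}$, $b = i e^{i\phi}/\sqrt{2}$, $c = i e^{-i\phi}/\sqrt{2}$, $d = e^{-i\theta}/\sqrt{2}$; the function names and the sample angles are illustrative choices only. 

```python
import cmath
import math


def symplectic_u(theta, phi):
    """Return the entries (a, b, c, d) of U, assuming the form of eqn. (14.21)."""
    s = 1.0 / math.sqrt(2.0)
    a = s * cmath.exp(1j * theta)
    b = s * 1j * cmath.exp(1j * phi)
    c = s * 1j * cmath.exp(-1j * phi)
    d = s * cmath.exp(-1j * theta)
    return a, b, c, d


def check(theta, phi):
    """Return (2ad, 2bc, det U) for the given angles."""
    a, b, c, d = symplectic_u(theta, phi)
    return 2 * a * d, 2 * b * c, a * d - b * c


# Arbitrary sample angles (hypothetical values chosen for illustration).
for theta, phi in [(0.0, 0.0), (0.3, 1.1), (2.0, -0.7)]:
    two_ad, two_bc, det = check(theta, phi)
    print(theta, phi, two_ad, two_bc, det)
```

For every choice of angles the three printed quantities should agree, up to rounding, with $2ad = 1$, $2bc = -1$ and $ad - bc = 1$. 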


14.1.4 Poisson brackets 


Having identified a symmetry group of the equations of motion which is general 
(i.e. which follows entirely from the definition of the conjugate momentum in 
terms of the Lagrangian), the next step is to ask which quantities are invariant 
under this symmetry group. A quantity of particular interest is the so-called 
Poisson bracket. 

If we apply the group transformation to the derivative operators, 

$$ \begin{pmatrix} D_+ \\ D_- \end{pmatrix} = U^{-1} \begin{pmatrix} \partial/\partial\hat{q} \\ \partial/\partial\hat{p} \end{pmatrix}, \tag{14.22} $$

then it is a straightforward algebraic matter to show that, for any two functions 
of the dynamical variables $X$, $Y$, the Poisson bracket, defined by 

$$ (D_+X)(D_-Y) - (D_-X)(D_+Y) = [X, Y]_{\hat{p}\hat{q}}, \tag{14.23} $$

is independent of $\theta$ and $\phi$ and is given in all bases by 

$$ [X, Y]_{pq} = \frac{\partial X}{\partial q}\frac{\partial Y}{\partial p} - \frac{\partial Y}{\partial q}\frac{\partial X}{\partial p}. \tag{14.24} $$

Notice, in particular, that, owing to the product of $p$ and $q$ in the denominators, this 
bracket is even independent of the re-scaling by $\Omega$ in eqn. (14.14). 

We shall return to the Poisson bracket to demonstrate its importance to the 
variational formalism and dynamics after a closer look at symmetry 
transformations. 


14.1.5 General canonical transformations 


The linear combinations of p,q described in the previous section form a 
symmetry which has its origins in the linear formulation of the Hamiltonian 
method. Symplectic symmetry is not the only symmetry which might leave the 
equations of motion invariant, however. More generally, we might expect the 
coordinates and momenta to be changed into quite different functions of the 
dynamical variables: 

$$ q \rightarrow q'(p, q, t), \qquad p \rightarrow p'(p, q, t). \tag{14.25} $$


Changes of variable fit this general description, as does the time development of 
p and q. We might, for example, wish to change from a Cartesian description 
of physics to a polar coordinate basis, which better reflects the symmetries of 
the problem. Any such change which preserves the form of the field equations 
is called a canonical transformation. 

It turns out that one can effect general infinitesimal transformations of 
coordinates by simply adding total derivatives to the Lagrangian. This is closely 
related to the discussion of continuity in section 4.1.4. Consider the following 
addition 


$$ L \rightarrow L + \frac{dF}{dt}, \tag{14.26} $$

for some arbitrary function F(q, p, t). Normally, one ignores total derivatives 
in the Lagrangian, for the reasons mentioned in section 4.4.2. This is because 
the action is varied, with the end-points of the variation fixed. However, if 
one relaxes this requirement and allows the end-points to vary about dynamical 
variables which obey the equations of motion, then these total derivatives (often 
referred to as surface terms in field theory), have a special and profound 
significance. Our programme and its notation are the following. 


• We add the total time derivative of a function $F(q, p, t)$ to the Lagrangian 
so that 

$$ S \rightarrow S + \int_{t_1}^{t_2} dt\, \frac{dF}{dt} = S + F\Big|_{t_1}^{t_2}. \tag{14.27} $$

• We vary the action and the additional term and define the quantity $G_\xi$, 
which will play a central role in transformation theory, by 

$$ \delta_\xi F = G_\xi, \tag{14.28} $$

so that 

$$ \delta S \rightarrow \delta S + G. \tag{14.29} $$

• We may optionally absorb the change in the variation of the action $G$ into 
the generalized coordinates by making a transformation, which we write 
formally as 

$$ q \rightarrow q + \delta q = q + R_\xi\, \delta\xi. \tag{14.30} $$

$R_\xi$ is called the auxiliary function of the transformation, and is related to 
$G_\xi$, which is called the generator of the transformation. This transforms 
the coordinates and momenta by an infinitesimal amount. 


Let us now illustrate the procedure in practice. The variation of the action 
may be written 


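$$ \begin{aligned} \delta S &= \delta \int dt\, \{ p\dot{q} - H \} + \int dt\, \frac{d}{dt}\,\delta F \\ &= \int dt \left\{ \left( -\dot{p} - \frac{\partial H}{\partial q} \right)\delta q + \left( \dot{q} - \frac{\partial H}{\partial p} \right)\delta p + \frac{\partial G}{\partial q}\,\delta q + \frac{\partial G}{\partial p}\,\delta p \right\}. \end{aligned} \tag{14.31} $$
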


In the last line we have expanded the infinitesimal change 6F = G in terms 
of its components along q and p. This can always be done, regardless of what 
general change G represents. We can now invoke the modified action principle 
and obtain the equations of motion: 


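$$ \frac{\delta S}{\delta q} = 0 = -\dot{p} - \frac{\partial H}{\partial q} + \frac{\partial G}{\partial q} = -(\dot{p} + \delta\dot{p}) - \frac{\partial H}{\partial q} $$

$$ \frac{\delta S}{\delta p} = 0 = \dot{q} - \frac{\partial H}{\partial p} + \frac{\partial G}{\partial p} = (\dot{q} + \delta\dot{q}) - \frac{\partial H}{\partial p}, \tag{14.32} $$

where we have identified 

$$ \delta\dot{p} = -\frac{\partial G}{\partial q}, \qquad \delta\dot{q} = \frac{\partial G}{\partial p}, \tag{14.33} $$

or, on integrating, 

$$ \delta p = -\frac{\partial G}{\partial q} = R^p_a\, \delta\xi^a, \qquad \delta q = \frac{\partial G}{\partial p} = R^q_a\, \delta\xi^a. \tag{14.34} $$
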

Notice that $G$ is infinitesimal, by definition, so we may always write it in terms 
of a set of infinitesimal parameters $\delta\xi$, but $\xi$ need not include $q$, $p$ now since the 
$q$, $p$ dependence was already removed to infinitesimal order in eqn. (14.31). 
It is now possible to see why $G$ is referred to as the generator of infinitesimal 
canonical transformations. 

3 The expressions are not incorrect for p, q variations, but they become trivial cases. 


14.1.6 Variations of dynamical variables and Poisson brackets 


One of the most important observations about variational dynamics, as far as the 
extension to quantum field theory is concerned, is that variational changes in any 
dynamical variable can be expressed in terms of the invariant Poisson bracket 
between that variable and the generator of the variation: 


$$ \delta X(p, q, t) = [X, G_\xi]_{pq}. \tag{14.35} $$

To see this, it is sufficient to use eqns. (14.34) in the differential expansion of 
the function: 

$$ \delta X = \frac{\partial X}{\partial q_i}\,\delta q_i + \frac{\partial X}{\partial p_i}\,\delta p_i. \tag{14.36} $$

Substituting for $\delta q_i$ and $\delta p_i$ gives eqn. (14.35). These relations are exemplified 
as follows. 


• Generator of time translations: $G_t = -H\,\delta t$; 

$$ \delta X = [X, H]\,\delta t. \tag{14.37} $$

Noting that the change in $X$ is the dynamical evolution of the function, 
but that the numerical value of $t$ is unaltered owing to linearity and the 
infinitesimal nature of the change, we have that 

$$ \delta X = \left( \frac{dX}{dt} - \frac{\partial X}{\partial t} \right) \delta t = [X, H]\,\delta t. \tag{14.38} $$

Thus, we arrive at the equation of motion for the dynamical variable $X$: 

$$ \frac{dX}{dt} = [X, H] + \frac{\partial X}{\partial t}. \tag{14.39} $$

This result has the following corollaries: 

$$ \dot{q} = [q, H], \qquad \dot{p} = [p, H], \qquad 1 = [H, t]. \tag{14.40} $$


The first two equations are simply a thinly concealed version of the 
Hamilton equations (14.9). The third, which is most easily obtained from 
eqn. (14.37), is an expression of the time independence of the Hamiltonian. 
An object which commutes with the Hamiltonian is conserved. 

• Generator of coordinate translations: $G_q = p\,\delta q$. 

It is interesting to note that, if we consider the variation in the coordinate 
$q$ with respect to the generator for $q$ itself, the result is an identity which 
summarizes the completeness of our dynamical system: 

$$ \begin{aligned} \delta q &= [q, G_q]_{pq} \\ \delta q &= [q, p]_{pq}\,\delta q \\ \Rightarrow \quad 1 &= [q, p]_{pq}. \end{aligned} \tag{14.41} $$

In Lorentz-covariant notation, one may write 

$$ [x^\mu, p_\nu] = \delta^\mu{}_\nu, \tag{14.42} $$

where $p_\mu = (-H/c, \mathbf{p})$. This result pervades almost all of dynamics 
arising from Lagrangian/Hamiltonian systems. In the quantum theory it 
is supplanted by commutation relations, which have the same significance 
as the Poisson bracket, though they are not directly related. 
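
The bracket relations of this section can be checked numerically for a simple case. The sketch below approximates the Poisson bracket of eqn. (14.24) by central finite differences for a single pair $(q, p)$, and applies it to a harmonic oscillator. The mass `m`, spring constant `k`, step size `h` and the sample phase-space point are hypothetical values chosen only for illustration. 

```python
def poisson_bracket(X, Y, q, p, h=1e-5):
    """Approximate [X, Y]_pq = dX/dq dY/dp - dY/dq dX/dp by central differences."""
    dXdq = (X(q + h, p) - X(q - h, p)) / (2 * h)
    dXdp = (X(q, p + h) - X(q, p - h)) / (2 * h)
    dYdq = (Y(q + h, p) - Y(q - h, p)) / (2 * h)
    dYdp = (Y(q, p + h) - Y(q, p - h)) / (2 * h)
    return dXdq * dYdp - dYdq * dXdp


m, k = 2.0, 3.0  # hypothetical mass and spring constant


def H(q, p):
    """Harmonic oscillator Hamiltonian (an illustrative choice)."""
    return p * p / (2 * m) + 0.5 * k * q * q


def Q(q, p):
    return q


def P(q, p):
    return p


q0, p0 = 0.7, -1.2  # arbitrary point in phase space
print(poisson_bracket(Q, P, q0, p0))  # canonical pair, compare with eqn. (14.41)
print(poisson_bracket(Q, H, q0, p0))  # compare with dq/dt = p/m
print(poisson_bracket(P, H, q0, p0))  # compare with dp/dt = -k q
```

The last two brackets reproduce the right-hand sides of the Hamilton equations for this Hamiltonian, as in the first two corollaries of eqn. (14.40). 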


14.1.7 Derivation of generators from the action 


Starting from the correctly symmetrized action in eqn. (14.6), the generator of 
infinitesimal canonical transformations for a variable $\xi$ is obtained from the 
surface contribution to the variation, with respect to $\xi$.⁴ For example, 

$$ \begin{aligned} \delta_q S &= \int dt \left( -\dot{p} - \frac{\partial H}{\partial q} \right) \delta q + \frac{1}{2}\, p\,\delta q \\ &= 0 + \frac{1}{2}\, G_q, \end{aligned} \tag{14.43} $$

where we have used the field equation to set the value of the integral in the first 
line to zero, and we identify 

$$ G_q = p\,\delta q. \tag{14.44} $$

Similarly, 

$$ \begin{aligned} \delta S &= \int dt \left( \dot{q} - \frac{\partial H}{\partial p} \right) \delta p - \frac{1}{2}\, q\,\delta p \\ &= \frac{1}{2}\, G_p, \end{aligned} \tag{14.45} $$

hence 

$$ G_p = -q\,\delta p. \tag{14.46} $$

For time variations, 

$$ \delta S = -H\,\delta t = G_t. \tag{14.47} $$

4 The constants of proportionality are rather inconsistent in this Hamiltonian formulation. If 
one begins with the action defined in terms of the Lagrangian, the general rule is: for actions 
which are linear in the time derivatives, the surface contribution is one-half the corresponding 
generator; for actions which are quadratic in the time derivatives, the generator is all of the 
surface contribution. 


The generators are identified, with numerical values determined by convention. 
The factors of one-half are introduced here in order to eliminate irrelevant 
constants from the Poisson bracket. This is of no consequence, as mentioned 
in the next section; it is mainly for aesthetic purposes. 

Suppose we write the action in the form 


$$ S = \int \{ p\,dq - H\,dt \}, \tag{14.48} $$


where we have cancelled an infinitesimal time differential in the first term. It is 
now straightforward to see that 


$$ \frac{\partial S}{\partial t} + H = 0. \tag{14.49} $$


This is the Hamilton-Jacobi equation of classical mechanics. From the action 
principle, one may see that this results from boundary activity, by applying a 
general boundary disturbance $F$: 

$$ S \rightarrow S + \int (dx)\, \partial_\mu F^\mu. \tag{14.50} $$

$\delta F^\mu = G^\mu$ is the generator of infinitesimal canonical transformations, and 
dG j 
2q = foo Rp 6&a- (14.51) 
Notice from eqn. (11.43) that 
fioro, = [cong — Dayar); (14.52) 
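which is to be compared with 

$$ \delta S = p\,dq - H\,dt. \tag{14.53} $$

Moreover, from this we have the Hamilton-Jacobi equation (see eqn. (11.78)), 

$$ \frac{\partial S}{\partial x^0} = -\frac{1}{c} \int d\sigma^\mu\, \theta_{\mu 0} = -\frac{H}{c}, \tag{14.54} $$

or 

$$ \frac{\partial S}{\partial t} + H = 0. \tag{14.55} $$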




14.1.8 Conjugate variables and dynamical completeness 


The commutator functions we have evaluated above are summarized by 


$$ [q_A, p_B]_{pq} = \delta_{AB}, \qquad [t, H] = 1. \tag{14.56} $$


These equations are a formal expression of the completeness of the set of 
variables we have chosen to parametrize the dynamical equations. Not every 
variational equation necessarily has coordinates and momenta, but every set of 
conservative dynamical equations has pairs of conjugate variables which play 
the roles of p and q. If one is in possession of a complete set of such variables 
(i.e. a set which spans all of phase space), then an arbitrary state of the dynamical 
system can be represented in terms of those variables, and it can be characterized 
as being canonical. 

Ignorable coordinates imply that the dimension of phase space is effectively 
reduced, so there is no contradiction in the presence of symmetries. 

Given the definition of the Poisson bracket in eqn. (14.24), the value of 
$[q, p]_{pq} = 1$ is unique. But we could easily have defined the derivative 
differently up to a constant, so that we had obtained 

$$ [q_A, p_B]_{pq} = \alpha\, \delta_{AB}. \tag{14.57} $$


What is important is not the value of the right hand side of this expression, but 
the fact that it is constant for all conjugate pairs. In any closed, conservative sys- 
tem, the Hamiltonian time relation is also constant, but again the normalization 
could easily be altered by an arbitrary constant. These are features which are 
basic to the geometry of phase space, and they carry over to the quantum theory 
for commutators. There it is natural to choose a different value for the constant 
and a different but equivalent definition of completeness. 


14.1.9 The Jacobi identity and group algebra 


The anti-symmetrical property of the Poisson bracket alone is responsible for 
the canonical group structure which it generates and the completeness argument 
above. This may be seen from an algebraic identity known as the Jacobi identity. 
Suppose that we use the bracket $[A, B]$ to represent any object which has the 
property 

$$ [A, B] = -[B, A]. \tag{14.58} $$

The Poisson bracket and the commutator both have this property. It may be seen, 
by writing out the combinations explicitly, that 

$$ [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0. \tag{14.59} $$

This result does not depend on whether $A$, $B$, $C$ commute. This equation is 
known as the Jacobi identity. It is closely related to the Bianchi identity in 
eqn. (2.27). 

Any objects which satisfy this identity also satisfy a Lie algebra. This is easily 
seen if we identify a symbol 


$$ T_A(B) = [A, B]. \tag{14.60} $$

Then, re-writing eqn. (14.59) so that all the $C$ elements are to the right, 

$$ [A, [B, C]] - [B, [A, C]] - [[A, B], C] = 0, \tag{14.61} $$

we have 

$$ T_A T_B(C) - T_B T_A(C) = T_{[A,B]}(C), \tag{14.62} $$

or 

$$ [T_A, T_B] = T_{[A,B]}. \tag{14.63} $$
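
The Jacobi identity and the resulting algebra of the symbols $T_A$ can be verified explicitly for matrix commutators, which are one example of an anti-symmetric bracket. The following Python sketch uses three arbitrary 2 × 2 integer matrices (hypothetical sample values) so that the arithmetic is exact; the helper names are illustrative only. 

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def comm(A, B):
    """The bracket [A, B] = AB - BA, anti-symmetric as in eqn. (14.58)."""
    return sub(matmul(A, B), matmul(B, A))


# Arbitrary integer matrices (sample values only).
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
C = [[2, -3], [7, 1]]

# Left-hand side of the Jacobi identity, eqn. (14.59).
jacobi = add(add(comm(A, comm(B, C)), comm(B, comm(C, A))), comm(C, comm(A, B)))

# With T_A(X) = [A, X], compare T_A T_B(C) - T_B T_A(C) with T_[A,B](C).
lhs = sub(comm(A, comm(B, C)), comm(B, comm(A, C)))
rhs = comm(comm(A, B), C)

print(jacobi)
print(lhs == rhs)
```

The first print shows the zero matrix, and the second confirms eqn. (14.62) for this choice of matrices. 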


14.2 Classical Lagrangian field dynamics 
14.2.1 Spacetime continuum 


In the traditional classical mechanics, one parametrizes a system by the coordi- 
nates and momenta of pointlike particles. In order to discuss continuous matter 
or elementary systems, we move to field theory and allow a smooth dependence 
on the x coordinate. 

In field theory, one no longer speaks of discrete objects with positions or 
trajectories (world-lines) q (t) or x(t). Rather x, t take on the roles of a ruler or 
measuring rod, which is positioned and oriented by the elements of the Galilean 
or Lorentz symmetry groups. Schwinger expresses this by saying that space and 
time play the role of an abstract measurement apparatus [119], which means 
that x is no longer q (t), the position of an existing particle. It is simply a point 
along some ruler, or coordinate system, which is used to measure position. The 
position might be occupied by a particle or by something else; then again, it 
might not be. 


14.2.2 Poisson brackets of fields 


The Poisson bracket is not really usable in field theory, but it is instructive to 
examine its definition as an invariant object. We begin with the relativistic scalar 
field as the prototype. 

The Poisson bracket of two functions $X$ and $Y$ may be written in one of two 
ways. Since the dynamical variables in continuum field theory are now $\phi_A(x)$ 
and $\Pi_A(x)$, one obvious definition is the direct transcription of the classical 
particle result for the canonical field variables, with only an additional integral 
over all space owing to the functional dependence on $x$. By analogy with Poisson 
brackets in particle mechanics, the bracketed quantities are evaluated at equal 
times. 

$$ [X, Y]_{\phi\Pi} = \int d\sigma_x \left( \frac{\delta X}{\delta\phi_A(x)} \frac{\delta Y}{\delta\Pi_A(x)} - \frac{\delta Y}{\delta\phi_A(x)} \frac{\delta X}{\delta\Pi_A(x)} \right). \tag{14.64} $$

With this definition we have 

$$ [\phi(x), \Pi(x')]_{\phi\Pi} \Big|_{t=t'} = \delta(\mathbf{x}, \mathbf{x}') $$

$$ \int d\sigma_{x'}\, [\phi(x), \Pi(x')]_{\phi\Pi} \Big|_{t=t'} = 1; \tag{14.65} $$


thus, the familiar structure is reproduced. It should be noted, however, that 
the interpretation of these results is totally different to that for classical particle 
mechanics. Classically, $q_A(t)$ is the position of the $A$th particle as a function 
of time. $\phi_A(x)$, on the other hand, refers to the $A$th species of scalar field 
(representing some unknown particle symmetry, or different discrete states, 
but there is no inference about localized particles at a definite position and 
particular time). To think of $\phi(x)$, $\Pi(x)$ as an infinite-dimensional phase space 
(independent variables at every new value of $x$) is not a directly useful concept. 
The above form conceals a number of additional subtleties, which are best 
resolved by abandoning the Hamiltonian variables in favour of a pure description 
in terms of the field and its Green functions. 

It is now possible to define the Poisson bracket using the fields and Green 
functions, ignoring the Hamiltonian idea of conjugate momentum. In this 
language, we may write the invariant Poisson bracket in terms of a directional 
functional derivative, for any two functions X and Y. 


[X, Y] = DxY — DyX, (14.66) 
where 
Dy = fa an lim 54 14.67 
xY = TAC) m xP (x), (14.67) 
and 
xpa (x) = | (dx’)G", p (x, xX p(x’). (14.68) 


Since we are looking at Lorentz-invariant quantities, there are several possible 
choices of causal boundary conditions, and we must define the causal nature of 
the variations. The natural approach is to use a retarded variation by introducing 
the retarded Green function explicitly in order to connect the source $\delta X_{,B}$ to the 
response $\delta_X\phi$. In terms of the small parameter $\epsilon$, we may write $\delta X_{,B} = X_{,B}\,\epsilon$, or 


dba) = f(a Gg. 0 DK at (14.69) 
Using this in eqn. (14.66), we obtain, in condensed notation, 
[X, Vlg = X.a G Yg- Ya GA? Xp, (14.70) 


or, in uncondensed notation, 


[X,Y] = (dx)(dx’) G4 (x, x) E 
on. as ) 7 SB) 
es G43 (x', x) es (14.71) 
SAG) tO BBD] i 
Now, using eqns. (5.74) and (5.71), we note that 

$$ 2\,\tilde{G}^{AB}(x, x') = G_r^{AB}(x, x') - G_r^{BA}(x', x), \tag{14.72} $$

so, re-labelling indices in the second term of eqn. (14.71), we have (condensed) 

$$ [X, Y]_\phi = 2\, Y_{,A}\, \tilde{G}^{AB}\, X_{,B} \tag{14.73} $$

or (uncondensed) 

$$ [X, Y]_\phi = 2 \int (dx)(dx')\, \frac{\delta Y}{\delta\phi^A(x)}\, \tilde{G}^{AB}(x, x')\, \frac{\delta X}{\delta\phi^B(x')}. \tag{14.74} $$

The connection between this expression and the operational definition in terms 
of Hamiltonian variables in eqn. (14.64) is not entirely obvious, 
but we hand-wave the reasonableness of the new expression by 
stretching formalism. From eqn. (5.73), one can write formally 

$$ \tilde{G}_{AB}(x, x') \Big|_{t=t'} = \delta_{AB}\, \delta(\mathbf{x}, \mathbf{x}')\, \frac{1}{\partial_0} \tag{14.75} $$

and thus, hand-wavingly, at equal times, 

$$ \frac{\delta}{\delta\phi^A}\, \tilde{G}^{AB}\, \frac{\delta}{\delta\phi^B} \sim \frac{\delta}{\delta\phi^A}\, \frac{\delta}{\delta(\partial_0\phi_A)} \sim \frac{\delta}{\delta\phi^A}\, \frac{\delta}{\delta\Pi_A}. \tag{14.76} $$

Although we have diverged from a covariant expression in eqn. (14.74) by 
singling out a spacelike hyper-surface in eqn. (14.76), this occurs in a natural 
way as a result of the retarded boundary conditions implicit in the causal 
variations. Manifest covariance of notation cannot alter the fact that time plays 
a special role in dynamical systems. Clearly, one has 


d(x) ~x 5 (009 (x')) 
II dy)(d G ———— 
[P(x), T] Loaf y)(dy’) 7 Ç y AB(Y, y’) 5a") 
1 (x) A f ôp x’) 
= | (dy)(d — 00G : 
fi nan Z2 oGag Y) pO 
= 8(x, x). (14.77) 


The Poisson bracket is only unique if the variables are observable, i.e. if they 
are invariant quantities. 


14.3 Classical statistical mechanics 


Statistical mechanics provides a natural point of departure from particle me- 
chanics. Although tethered to classical particle notions in the form of canonical 
Hamiltonian relations, it seeks to take the limit $N \rightarrow \infty$ of infinite numbers 
of discrete particles. It thereby moves towards a continuum representation of 
matter, which is a step towards field theory. To understand field theory fully, 
it is necessary to acknowledge a few of its roots in statistical mechanics. By 
definition, statistical mechanics is about many-particle systems. 


14.3.1 Ensembles and ergodicity 


An ensemble is formally a collection of ‘identical’ systems. The systems in 
an ensemble are identical in the sense that they contain the same dynamical 
variables and properties, not in the sense that each system is an exact image 
of every other with identical values for all its variables (that would be a 
useless concept). The concept of ensembles is useful for discussing the random 
or (more correctly) unpredictable development of systems under sufficiently 
similar conditions. An ensemble is a model for the possible ways in which one 
system might develop, taking into account a random or unpredictable element. 
If one takes a snapshot of systems in an ensemble at any time, the outcome could 
have happened in any of the systems, and may indeed happen in the future in 
any or all of them if they were allowed to run for a sufficient period of time. 
Ensembles are used to discuss the process of averaging over possible outcomes. 

The ergodic hypothesis claims that the time average of a system is the same as 
an ensemble average in the limit of large times and large numbers of ensembles. 
In the limit of infinite time and ensembles, this hypothesis can be proven. The 
implication is that it does not matter how we choose to define the average 
properties of a complex (statistical) system, the same results will always be 
obtained. The ergodic hypothesis is therefore compatible with the continuum 
hypothesis, but can be expected to fail when one deals with measurably finite 
times or countably finite ensembles. Much of statistical mechanics and much of 
quantum theory assumes the truth of this hypothesis. 
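
The equality of time and ensemble averages can be illustrated with a very simple system. The Python sketch below considers a harmonic oscillator of fixed energy and compares the time average of $q^2$ along one trajectory with the average of $q^2$, at a single instant, over an ensemble of oscillators whose initial phases are spread uniformly. The amplitude, frequency, sampling parameters and the choice of a uniform-phase ensemble are all assumptions made for the purpose of illustration. 

```python
import math

A, omega = 1.5, 2.0  # hypothetical amplitude and angular frequency


def q(t, phase):
    """Position of an oscillator with the given initial phase."""
    return A * math.cos(omega * t + phase)


# Time average of q^2 along a single trajectory, sampled over many periods.
n_t = 100000
T = 500.0
time_avg = sum(q(i * T / n_t, 0.3) ** 2 for i in range(n_t)) / n_t

# Ensemble average at one fixed instant over uniformly spread initial phases.
n_e = 1000
ens_avg = sum(q(1.234, 2 * math.pi * j / n_e) ** 2 for j in range(n_e)) / n_e

print(time_avg, ens_avg, A * A / 2)
```

The two averages approach the common value $A^2/2$, with the time average differing by a small remainder that decreases as the averaging time $T$ grows. For finite $T$ the agreement is only approximate, in keeping with the caveat about finite times above. 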


14.3.2 Expectation values and correlations 


Macroscopic observables are the expectation values of variables, averaged over 
time, or over many similar particle systems (an ensemble). The expectation 
value of a dynamical variable X (q, p) is defined by the ensemble average. For 
N particles in a fixed volume V, one has 


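$$ \langle X \rangle_{pq} = \overline{X}(t) = \frac{\int d^Nq\, d^Np\; \rho(q, p, t)\, X(q, p, t)}{\int d^Nq\, d^Np\; \rho(q, p, t)}, \tag{14.78} $$

where $\rho$ is the density of states in phase space in the fixed volume $V$. This is 
sometimes written 

$$ \langle X \rangle_{pq} = \mathrm{Tr}(\rho X). \tag{14.79} $$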


The integral in eqn. (14.78) is interpreted as an ensemble average because it 
integrates over every possible position and momentum for the particles. All 
possible outcomes are taken into account because the integral averages over all 
possible outcomes for the system, which is like averaging over a number of 
systems in which one (by the rules of chance) expects every possibility to occur 
randomly. 

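Suppose one defines the generating or partition functional $Z_{pq}[J(t)]$ by 

$$ Z_{pq}[J(t)] = \int d^Nq\, d^Np\; \rho(q, p, t)\, e^{\int J X\, dt}, \tag{14.80} $$

and the ‘transformation function’ by 

$$ W_{pq}[J] = -\ln Z_{pq}[J(t)], \tag{14.81} $$

then the average value of $X$ can be expressed as a functional derivative in the 
following way: 

$$ \langle X(t) \rangle = -\frac{\delta W_{pq}[J(t)]}{\delta J(t)} \bigg|_{J=0}. \tag{14.82} $$

Similarly, the correlation function is 

$$ \langle X(t)\, X(t') \rangle = -\frac{\delta^2 W_{pq}[J(t)]}{\delta J(t)\, \delta J(t')} \bigg|_{J=0}. \tag{14.83} $$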

Notice how this is essentially the Feynman Green function, providing a link 
between statistical physics and mechanics through this symmetrical Green 
function. 
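
The generating-function construction can be tried out on a finite ensemble. The Python sketch below is a simplified illustration under stated assumptions: the phase-space integral is replaced by a sum over three states with hypothetical weights, only a single instant of time is considered so that $J$ is an ordinary number, and the exponent in the generating function is taken with a positive sign. Derivatives of $W$ are approximated by finite differences. 

```python
import math

weights = [0.2, 0.5, 0.3]  # hypothetical density rho over three states
values = [1.0, 2.0, 4.0]   # value of the variable X in each state


def W(J):
    """W(J) = -ln Z(J), with Z(J) = sum_i rho_i exp(J X_i)."""
    Z = sum(r * math.exp(J * x) for r, x in zip(weights, values))
    return -math.log(Z)


def mean_direct():
    """Ensemble average of X computed directly, as in eqn. (14.78)."""
    return sum(r * x for r, x in zip(weights, values)) / sum(weights)


def variance_direct():
    mean = mean_direct()
    second = sum(r * x * x for r, x in zip(weights, values)) / sum(weights)
    return second - mean * mean


h = 1e-4  # finite-difference step
mean_from_W = -(W(h) - W(-h)) / (2 * h)
second_from_W = -(W(h) - 2 * W(0.0) + W(-h)) / (h * h)

print(mean_from_W, mean_direct())
print(second_from_W, variance_direct())
```

In this discrete setting the first derivative of $-W$ reproduces the ensemble average, while the second derivative gives the connected combination $\langle X^2 \rangle - \langle X \rangle^2$. 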
14.3.3 Liouville’s theorem 


An important theorem in statistical mechanics clarifies the definition of time 
evolution in many-particle systems, where it is impractical to follow the trajec- 
tory of every particle. This theorem applies to closed, conservative systems. 

A given point in phase space represents a specific state of a many-particle 
system. The density $\rho$ of points in phase space can itself be thought of as a 
dynamical variable which deforms with time in such a way that the number 
of points in an infinitesimal volume element is constant. The overall density 
of points is constant in time since the number of particles is constant and the 
volume is a constant, by assumption: 


$$ \frac{d\rho}{dt} = 0, \tag{14.84} $$

or, equivalently, 

$$ \frac{\partial\rho}{\partial t} + [\rho, H]_{pq} = 0. \tag{14.85} $$

This last form is an expression of how the local density at a fixed point $(q, p)$ in 
phase space (a fixed state) varies in time. When a dynamical system is in static 
equilibrium, the density of states at any point must be a constant, thus 

$$ [\rho, H]_{pq} = 0. \tag{14.86} $$


In a classical Hamiltonian time development, regions of phase space tend to 
spread out, distributing themselves over the whole of phase space (this is the 
essence of ergodicity); Liouville’s theorem tells us that they do so in such a way 
as to occupy the same total volume when the system is in statistical equilibrium. 
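
The preservation of phase-space volume can be seen directly in a one-dimensional example. The Python sketch below transports the corners of a small cell in the $(q, p)$ plane with the exact flow of a harmonic oscillator and compares the enclosed area before and after. The oscillator, its parameters, the cell and the elapsed time are hypothetical choices; since this flow is linear, the image of the polygon is again a polygon and the shoelace formula applies exactly. 

```python
import math


def flow(q, p, t, m=1.0, omega=2.0):
    """Exact flow of H = p^2/(2m) + m omega^2 q^2 / 2 over a time t."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (q * c + p * s / (m * omega), p * c - m * omega * q * s)


def polygon_area(points):
    """Shoelace formula for the area enclosed by a polygon in the (q, p) plane."""
    area = 0.0
    n = len(points)
    for i in range(n):
        q1, p1 = points[i]
        q2, p2 = points[(i + 1) % n]
        area += q1 * p2 - q2 * p1
    return abs(area) / 2.0


# A small rectangular cell of initial states (sample values).
cell = [(1.0, 0.0), (1.2, 0.0), (1.2, 0.3), (1.0, 0.3)]
area_before = polygon_area(cell)

moved = [flow(q, p, t=0.9) for q, p in cell]
area_after = polygon_area(moved)

print(area_before, area_after)
```

The cell is displaced and sheared by the motion, but its area is unchanged up to rounding. 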

Another way of looking at this is in terms of the distribution function for the 
field. If the number of states does not change, as is the case for a free field, then 

$$ \frac{d}{dt}\, f(p, x) = 0. \tag{14.87} $$

By the chain-rule we may write 

$$ \left[ \frac{\partial}{\partial t} + \left( \frac{dx^i}{dt} \right) \partial_i + \left( \frac{dp^i}{dt} \right) \frac{\partial}{\partial p^i} \right] f(p, x) = 0. \tag{14.88} $$

The rate of change of momentum is just the force. In a charged particle field 
(plasma) this is the Lorentz force $F_i = q\,(E_i + \epsilon_{ijk}\, v^j B^k)$. 


14.3.4 Averaged dynamical variations 


Since the expectation value is a simple product-weighted average, Liouville’s 
theorem tells us that the time variation of expectation values is simply the 
expectation value of the time variation, i.e. these two operations commute 
because the time derivative of $\rho$ is zero: 

$$ \begin{aligned} \frac{d}{dt} \langle X \rangle_{pq} &= \frac{d}{dt}\, \mathrm{Tr}(\rho X) \\ &= \mathrm{Tr}\, \frac{d\rho}{dt}\, X + \mathrm{Tr}\, \rho\, \frac{dX}{dt} \\ &= \mathrm{Tr}\, \rho\, \frac{dX}{dt} \\ &= \left\langle \frac{dX}{dt} \right\rangle_{pq}. \end{aligned} \tag{14.89} $$

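This may also be written as 

$$ \frac{d}{dt} \langle X \rangle_{pq} = \left\langle [X, H]_{pq} + \frac{\partial X}{\partial t} \right\rangle_{pq}, \tag{14.90} $$

or, more generally for variations, as 

$$ \langle \delta X \rangle_{pq} = \left\langle [X, G_\xi]_{pq} \right\rangle_{pq}. \tag{14.91} $$
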

Again, the similarity to the mechanical theory is striking. 


14.4 Quantum mechanics 


The discovery of de Broglie waves in electron diffraction experiments by 
Davisson and Germer (1927) and Thomson (1928) effectively undermined 
the status of particle coordinates as a fundamental dynamical variable in the 
quantum theory. The wavelike nature of light and matter cannot be reconciled 
with discrete labels $q_A(t)$ at the microscopic level. A probabilistic element was 
necessary to explain quantum mechanics. This is true even of single particles; it 
is not merely a continuum feature in the limit of large numbers of particles, such 
as one encounters in statistical mechanics. Instead it was necessary to find a new, 
more fundamental, description of matter in which both the wavelike properties 
and impulsive particle properties could be unified. Such a description is only 
possible by a more careful study of the role played by invariance groups. 

Because of the cumbersome nature of the Poisson bracket for continuum 
theory, continuum theories are not generally described with Poisson algebras. 
Instead, an equivalent algebra arises naturally from the de Broglie relation 
$p_\mu = \hbar k_\mu$: namely commutator algebras. The important property one wishes 
to preserve is the anti-symmetry of the conjugate pair algebra, which leads to 
the canonical invariances. 

In classical mechanics, $q(t)$ does not transform like a vector under the action 
of symmetry groups, dynamical or otherwise. A more direct route to the 
development of the system is obtained by introducing an eigenvector basis in 
group space which does transform like a vector and which employs operators to 
extract the dynamical information. 


14.4.1 Non-relativistic quantum mechanics in terms of groups and operators 


Schrödinger’s formulation of quantum mechanics is postulated by starting with 
the Galilean energy conservation relation 

$$ E = \frac{p^2}{2m} + V \tag{14.92} $$

and making the operator replacements $E \rightarrow i\hbar\partial_t$ and $p \rightarrow -i\hbar\nabla$. The 
solution of this equation, together with the interpretation of the wavefunction 
and a specification of boundary conditions, is quantum mechanics. It is 
interesting nonetheless to examine quantum mechanics as a dynamical system 
in order to identify its relationship to classical mechanics. The main physical 
assumptions of quantum mechanics are the de Broglie relation $p_\mu = \hbar k_\mu$ 
and the interpretation of the wavefunction as a probability amplitude for a 
given measurement. The remainder is a purely group theoretical framework 
for exploiting the underlying symmetries and structure. 

From a group theoretical viewpoint, quantum mechanics is simpler than 
classical mechanics, and has a more natural formulation. The use of Poisson 
brackets to describe field theory is not practical, however. Such a formulation 
would require intimate knowledge of Green functions and boundary conditions, 
and would involve complicated functional equations. To some degree, this is 
the territory of quantum field theory, which is beyond the scope of this work. In 
the usual approach, canonical invariances are made possible by the introduction 
of a vector space description of the dynamics. It is based upon the algebra of 
observables and the method of eigenvalues. 


The wavefunction or field Since a particle position cannot be a wave (a particle 
is by definition a localized object), and neither can its momentum, it is postulated 
that the wavelike nature of quantum propagation is embodied in a function 
of state for the particle system called the wavefunction and that all physical 
properties (called observables) can be extracted from this function by Hermitian 
operators. The wavefunction $\psi(x, t)$ is postulated to be a vector in an abstract 
multi-dimensional Hilbert space, whose magnitude and direction contain all the 
information about the particle, in much the same way that phase space does 
for classical particle trajectories. 

The fact that the wavefunction is a vector is very convenient from the point 
of view of the dynamics (see section 8.1.3), since it means that the generators of 
invariance groups can operate directly on them by multiplication. This leads to a 
closer connection with the group theory that explains so much of the dynamics. 
It means that any change in the system, characterized by a group transformation 
$U$, can be expressed in the operational form 

$$ \psi' = U \psi. \tag{14.93} $$

This is much simpler than the pair of equations in (14.34). It is, in fact, more 
closely related to eqn. (14.35) because of the group structure in eqn. (14.63), as 
we shall see below. 

Table 14.1. Dynamical formulations. 

Classical      Schrödinger                Heisenberg 
$q(t)$         $\hat{x}\,\psi(x, t)$      $\hat{x}(t)\,\psi(x)$ 
$p(t)$         $\hat{p}\,\psi(x, t)$      $\hat{p}(t)\,\psi(x)$ 


Operator-valued position and momentum q(t) and p(t) may be effectively 
supplanted as the dynamical variables by the wavefunction. To represent 
the position and momentum of particles, one makes a correspondence with 
operators according to one of two equivalent prescriptions (table 14.1). The 
choice depends on whether one wishes to place the time development of the 
system in the definition of the operators, or whether it should be placed in 
the wavefunction, along with all the other dynamical parameters. These two 
descriptions are equivalent to one another in virtue of the group combination 
law. We shall mainly use the Schrödinger representation here since this is more 
in tune with the group theoretical ideology of symmetries and generators. 

As explained in section 11.1, it is the operators themselves, for dimensional 
reasons, which are the positions and momenta, not the operators multiplying 
the fields. The observable values which correspond to the classical quantities 
are extracted from this function by considering the eigenvalues of the operators. 
Since the wavefunction $\psi(x)$ can always be written as a linear combination 
of the complete set of eigenvectors $E_a(x)$ belonging to any operator on Hilbert 
space, with constants $\lambda_a$, 

$$ \psi(x) = \sum_a \lambda_a E_a(x), \tag{14.94} $$

there is always a well defined eigenvalue problem which can convert a Hermitian 
operator into a real eigenvalue. 

Commutation relations Since the field Poisson bracket is unhelpful, we look for 
a representation of position and momentum which distills the important property 
in eqn. (14.57) from the classical canonical theory and injects it into the quantum 
theory. One sees that, on choosing the following algebraic representation of the 
underlying Galilean symmetry group for the wavefunction⁵ 

$$ \psi(x) = \sum_k a_k\, e^{i(kx - \omega t)}, \tag{14.95} $$

a simple representation of the operators $\hat{x}$ and $\hat{p}$ may be constructed from 

$$ \hat{x} = x, \qquad \hat{p} = -i\hbar\nabla. \tag{14.96} $$

These operators live on the vector space of the Galilean group (i.e. real space), 
so it is natural to use their operator property directly in forming a canonical 
algebra. They are complete, as may be verified by computing the straightforward 
commutator 

$$ [\hat{x}, \hat{p}] = \hat{x}\hat{p} - \hat{p}\hat{x} = i\hbar. \tag{14.97} $$


This clearly satisfies eqn. (14.57). Thus, with this representation of position 
and momentum, based directly on the underlying symmetry of spacetime, there 
is no need to introduce an abstract phase space in order to construct a set 
of vectors spanning the dynamics. The representations of the Galilean group 
suffice. The only contribution from empirical quantum theory is the expression 
of the wavenumber k and the frequency ω in terms of the de Broglie relation. In
fact, this cancels from eqn. (14.97). 
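
The commutator in eqn. (14.97) can be checked numerically by letting both operators act on a test function. The following is a minimal sketch, not part of the original derivation: it assumes units with ħ = 1, replaces the derivative by a central difference, and uses a hypothetical Gaussian-modulated plane wave `psi` as the test function. The names `x_op`, `p_op`, `commutator_xp` and `max_residual` are illustrative only.

```python
import cmath
import math

HBAR = 1.0  # assumed natural units, for illustration only


def x_op(f):
    """Position operator: (x f)(x) = x * f(x)."""
    return lambda x: x * f(x)


def p_op(f, h=1e-4):
    """Momentum operator: (p f)(x) = -i*hbar*df/dx, via a central difference."""
    return lambda x: -1j * HBAR * (f(x + h) - f(x - h)) / (2 * h)


def commutator_xp(f):
    """Return the function ([x, p] f)(x) = (x p f)(x) - (p x f)(x)."""
    xp = x_op(p_op(f))
    px = p_op(x_op(f))
    return lambda x: xp(x) - px(x)


def psi(x):
    """Hypothetical test wavefunction: a Gaussian-modulated plane wave."""
    return cmath.exp(2j * x) * math.exp(-0.5 * x * x)


def max_residual(points):
    """Largest deviation of [x, p] psi from i*hbar*psi over the sample points."""
    comm = commutator_xp(psi)
    return max(abs(comm(x) - 1j * HBAR * psi(x)) for x in points)


if __name__ == "__main__":
    grid = [-2.0 + 0.25 * n for n in range(17)]
    print(max_residual(grid))
```

The residual is limited only by the finite-difference step, which illustrates that the relation holds for any smooth test function and does not depend on the particular wavenumber chosen.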


Dirac notation: bases In Dirac notation, Hilbert space vectors are usually
written using angle brackets (|x⟩, |ψ⟩). To avoid confusing this notation with
that for expectation values, used earlier, we shall use another fairly common
notation, |x), |ψ), here. The components of such a vector are defined with
respect to a particular basis. Adjoint vectors are written (x|, (ψ|, and so on.

The scalar product of two such vectors is independent of the basis, and is 
written 


$$ (\psi_1|\psi_2) = \int d\sigma\, \psi_1^\dagger(x)\,\psi_2(x) = \int dp\, \psi_1^\dagger(p)\,\psi_2(p). \qquad (14.98) $$


In Dirac notation one considers the functional dependence of wavefunctions to 
be the basis in which they are defined. Thus, ψ(x) is likened to the components


5 Note that the representations of the Galilean group are simply the Fourier expansion. 

Table 14.2. Matrix elements and operator bases.

Ô      (x'|Ô|x)           (p'|Ô|p)
p̂      −iħ∇ δ(x, x')      p δ(p, p')
x̂      x δ(x, x')         −iħ (∂/∂p) δ(p, p')


of the general function ψ in an x basis. Similarly, the Fourier transform ψ(p)
is thought of as the components of ψ in a p basis. As in regular geometry, the
components of a vector are obtained by taking the scalar product of the vector 
with a basis vector. In Dirac notation, the wavefunction and its Fourier transform 
are therefore written as 


$$ \psi(x) = (x|\psi), \qquad \psi(p) = (p|\psi), \qquad (14.99) $$


as a projection of the vector onto the basis vectors. The basis vectors |x) and 
|p) form a complete set of eigenstates of their respective operators, satisfying 
the completeness relation 


$$ (x|x') = \delta(x, x'). \qquad (14.100) $$

Similarly, a matrix, or operator is also defined by an outer product according to 
what basis, or type of variable, it operates on. The identity operator in a basis x 
is 

$$ \hat{I} = \int d\sigma_x\, |x)(x|, \qquad (14.101) $$

and similarly, in an arbitrary basis ξ, one has

$$ \hat{I} = \int d\xi\, |\xi)(\xi|. \qquad (14.102) $$


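See table 14.2. This makes the scalar product basis-independent:

$$ (\psi_1|\psi_2) = \int d\sigma_x\, (\psi_1|x)(x|\psi_2), \qquad (14.103) $$

as well as the expectation value of Ô with respect to the state |ψ):

$$ (\psi|\hat{O}|\psi) = \int d\sigma_x\, d\sigma_{x'}\, (\psi|x')(x'|\hat{O}|x)(x|\psi). \qquad (14.104) $$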
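
The basis-independence of the scalar product can be illustrated on a finite lattice, where the change from the x basis to the p basis becomes a unitary discrete Fourier transform. This is a sketch under stated assumptions rather than anything taken from the text: the four-component sample states `PSI_1` and `PSI_2` and the function names are hypothetical, and the continuum integrals are replaced by finite sums.

```python
import cmath


def to_momentum_basis(components):
    """Unitary discrete Fourier transform of a list of position-basis components."""
    n = len(components)
    norm = n ** -0.5
    return [
        norm * sum(c * cmath.exp(-2j * cmath.pi * j * k / n)
                   for j, c in enumerate(components))
        for k in range(n)
    ]


def scalar_product(a, b):
    """Discrete analogue of the scalar product: sum over conj(a_i) * b_i."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


# Hypothetical sample states, given by their components in the x basis.
PSI_1 = [1.0, 2.0j, -1.0 + 1.0j, 0.5]
PSI_2 = [0.5j, 1.0, 2.0 - 1.0j, -1.0j]

if __name__ == "__main__":
    print(scalar_product(PSI_1, PSI_2))
    print(scalar_product(to_momentum_basis(PSI_1), to_momentum_basis(PSI_2)))
```

Both printed values agree to rounding error, which is the discrete counterpart of inserting the identity operator in either basis.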

Transformation function The scalar product (ψ₁|ψ₂) represents an overlap of
one state of the system with another; thus, the implication is that transitions
or transformations from one state to another can take place between these two
states. (ψ₂(x₂)|ψ₁(x₁)) is often called the transformation function. It represents
the probability amplitude of a transition from ψ₁(x₁) to ψ₂(x₂). The quantity

$$ A = (\psi'|\hat{O}|\psi) \qquad (14.105) $$

is not an expectation value, since it refers to two separate states; rather, it is to be
interpreted as another transition amplitude, perturbed by the operation Ô, since

$$ \hat{O}|\psi) = |\psi''). \qquad (14.106) $$

Thus

$$ A = (\psi'|\psi''), \qquad (14.107) $$


which is just another transition function. The transformation function plays 
a central role in Schwinger’s action principle for quantum mechanics, and is 
closely related to the path integral formulation. 


Operator variations and unitary transformations In order to define a variational 
theory of quantum mechanics, meaning must be assigned to the variation of 
an operator. An operator has no meaning without a set of vectors on which 
to operate, so the notion of an operator variation must be tied to changes in 
the states on which it operates. States change when they are multiplied by the 
elements of a transformation group U: 


$$ |\psi) \to U|\psi). \qquad (14.108) $$

Similarly, the adjoint transforms by

$$ (\psi| \to (\psi|U^\dagger. \qquad (14.109) $$

The invariance of the scalar product (ψ|ψ) implies that U must be a unitary
transformation, satisfying

$$ U^\dagger = U^{-1}. \qquad (14.110) $$


Consider an infinitesimal unitary transformation with generator G such that U =
exp(−iG/ħ):

$$ |\psi) \to e^{-iG/\hbar}|\psi) = (1 - iG/\hbar)|\psi). \qquad (14.111) $$

The change in an expectation value due to an operator variation X̂ → X̂ + δX̂ is

$$ (\psi|\hat{X} + \delta\hat{X}|\psi) = (\psi|e^{iG/\hbar}\,\hat{X}\,e^{-iG/\hbar}|\psi) = (\psi|(1 + iG/\hbar)\,\hat{X}\,(1 - iG/\hbar)|\psi), \qquad (14.112) $$

or, equating δX̂ to the first infinitesimal order on the right hand side,

$$ \delta\hat{X} = \frac{1}{i\hbar}[\hat{X}, G]. \qquad (14.113) $$
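
The first-order statement in eqn. (14.113) can be tested on small matrices. The sketch below is illustrative only: it assumes ħ = 1, takes the Pauli matrix σ_x as a stand-in for the operator and ε σ_y as a hypothetical generator, and compares the product form on the right hand side of eqn. (14.112) with the operator plus the commutator term. The mismatch should scale as ε², confirming that the commutator captures the whole first-order change.

```python
HBAR = 1.0  # assumed units, for illustration only

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]      # plays the role of the operator X
SIGMA_Y = [[0.0, -1.0j], [1.0j, 0.0]]   # direction of the Hermitian generator


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, coeff):
    """Return the matrix a + coeff * b."""
    return [[x + coeff * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def transformed(x, g):
    """(1 + iG/hbar) X (1 - iG/hbar), the product form in eqn. (14.112)."""
    left = combine(IDENTITY, g, 1j / HBAR)
    right = combine(IDENTITY, g, -1j / HBAR)
    return matmul(matmul(left, x), right)


def variation(x, g):
    """delta X = [X, G] / (i hbar), as in eqn. (14.113)."""
    commutator = combine(matmul(x, g), matmul(g, x), -1.0)
    return [[entry / (1j * HBAR) for entry in row] for row in commutator]


def max_abs_difference(a, b):
    """Largest entry-wise distance between two matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def first_order_error(eps):
    """Mismatch between the product form and X + delta X for G = eps * sigma_y."""
    g = [[eps * entry for entry in row] for row in SIGMA_Y]
    exact = transformed(SIGMA_X, g)
    approx = combine(SIGMA_X, variation(SIGMA_X, g), 1.0)
    return max_abs_difference(exact, approx)


if __name__ == "__main__":
    for eps in (1e-2, 1e-3):
        print(eps, first_order_error(eps))
```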


Eqn. (14.113) may be taken as the definition of operator variations, effected
by unitary transformations. It can be compared with eqn. (14.35) for the
canonical variations. It is this definition which permits an action principle to
be constructed for quantum mechanics. From eqn. (14.113), one can define the
expectation value

$$ i\hbar\,\delta\langle\hat{X}\rangle = (\xi|[\hat{X}, G]|\xi), \qquad (14.114) $$

and, by a now familiar argument for the time variation G_t = −H δt,

$$ \frac{d}{dt}\langle\hat{X}\rangle = \left\langle \frac{\partial\hat{X}}{\partial t} + \frac{1}{i\hbar}[\hat{X}, H] \right\rangle, \qquad (14.115) $$

where the expectation value is interpreted with respect to a basis ξ in Hilbert
space:

$$ \langle \ldots \rangle = \int d\xi\, (\xi| \ldots |\xi). \qquad (14.116) $$
This relation can be compared with eqn. (14.90) from classical statistical 
mechanics. 
It is straightforward to verify Hamilton’s equations for the operators by taking 


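$$ H = -\frac{\hbar^2}{2m}\nabla^2 + V, \qquad (14.117) $$

so that

$$ \dot{\hat{p}} = \frac{1}{i\hbar}[\hat{p}, H] = -\nabla H, $$

$$ \dot{\hat{q}} = \frac{1}{i\hbar}[\hat{q}, H] = i\hbar\left[x, \frac{\nabla^2}{2m}\right] = -i\hbar\frac{\nabla}{m} = \frac{\hat{p}}{m} = \frac{\partial H}{\partial\hat{p}}. \qquad (14.118) $$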


The last step here is formal, since the derivation with respect to an operator 
is really a distribution or Green function, which includes a specification of 
boundary conditions. In this case, the only possibility is for causal, retarded, 
boundary conditions, and thus the expression is unambiguous. 


Statistical interpretation By comparing quantum expectation values, or scalar 
products, with statistical mechanics, it is possible to see that the states referred 
to in quantum mechanics must have a probabilistic interpretation. This follows 
directly from the canonical structure and from the analogy between the quantum
state function and the density operator in eqn. (14.78). 

If it were not already clear from phenomenology, it should now be clear from 
eqn. (14.37) that the quantum theory has the form of a statistical theory. Thus, 
the wavefunction can be regarded as a probabilistic object, and the waves which 
give rise to quantum interference are ‘probability waves’. 

The basis-independence of the quantum expectation value is analogous to the 
ergodicity property of classical mechanics: it implies that it is not important 
what variable one chooses to average over. A ‘dynamically complete’ average 
will lead to the same result. 

The formalism of quantum theory makes no statements about wave-particle 
duality and no confusion arises with regard to this concept. Quantum mechanics 
must be regarded as a straightforward generalization of classical canonical 
mechanics, which admits a greater freedom of expression in terms of group 
theory and invariances. 


Classical correspondence Although sometimes stated misleadingly, the corre- 
spondence between the Poisson bracket in classical mechanics and the commu- 
tator in quantum mechanics is not such that one recovers the Poisson bracket 
formulation from the classical limit of the commutator. They are completely 
independent, since they refer to different spaces. While the commutator function 
exists in the classical limit ħ → 0, the wavefunction does not, since k → ∞
and ω → ∞. Thus, the basis vectors cease to exist.

The true correspondence with classical physics is through expectation values 
of quantum operators, since these are independent of the operator basis. The 
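classical theory is recovered through the equations

$$ \frac{\langle\hat{p}^2\rangle}{2m} + \langle V\rangle = \langle E\rangle, \quad\text{i.e.}\quad \left\langle -\frac{\hbar^2\nabla^2}{2m} + V \right\rangle = \left\langle i\hbar\frac{\partial}{\partial t} \right\rangle, \qquad (14.119) $$

and

$$ \begin{aligned}
\left\langle \frac{d\hat{p}}{dt} \right\rangle = \frac{d\langle\hat{p}\rangle}{dt}
&= \frac{1}{i\hbar}\langle[\hat{p}, H]\rangle \\
&= \frac{1}{i\hbar}\langle \hat{p}V(x) - V(x)\hat{p} \rangle \\
&= \frac{1}{i\hbar}\langle -i\hbar\,(\nabla V(x)) \rangle \\
&= -\langle \nabla V(x) \rangle, \qquad (14.120)
\end{aligned} $$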


which is Newton’s law. 
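
The step from the commutator to the gradient in eqn. (14.120) can be spelled out by letting the operators act on an arbitrary wavefunction ψ. The following worked sketch assumes a Hamiltonian of the form in eqn. (14.117), so that only V(x) fails to commute with the momentum operator:

```latex
\begin{align*}
[\hat p, H]\,\psi &= [\hat p, V(x)]\,\psi
   = -i\hbar\,\nabla\big(V\psi\big) + i\hbar\,V\,\nabla\psi \\
  &= -i\hbar\,(\nabla V)\,\psi ,
\qquad\text{so that}\qquad
\frac{1}{i\hbar}\big\langle [\hat p, H] \big\rangle
   = -\big\langle \nabla V(x) \big\rangle .
\end{align*}
```

The kinetic term drops out because the momentum operator commutes with any function of itself.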


14.4.2 Quantum mechanical action principle


Schwinger has shown that the complete unitary, dynamical structure of quantum 
mechanics can be derived from a quantum action principle, based on operator 
variations. The quantum action principle is directly analogous to its classical 
counterpart. We shall return to this quantum action principle in chapter 15 
since it plays a central role in modern quantum field theory. For now, the 
action principle will not be proven; instead we summarize the main results. The 
algebraic similarities to the classical action principle are quite remarkable. 

The central object in the quantum theory is the transformation function or
transition amplitude (ψ|ψ'). The quantum action principle states that the action
is a generating functional, which induces changes in the transformation function,

$$ \delta(\psi(t_2)|\psi(t_1)) = \frac{1}{i\hbar}(\psi(t_2)|\delta\hat{S}_{12}|\psi(t_1)), \qquad (14.121) $$

where Ŝ is the action operator, which is constructed from the classical action by
replacing each dynamical object by its operator counterpart:

$$ \hat{S}_{12} = \int_{t_1}^{t_2} dt \left( \frac{1}{2}\left(\hat{p}\dot{\hat{q}} - \hat{q}\dot{\hat{p}}\right) - \hat{H} \right). \qquad (14.122) $$


In this simple case, the ordering of the operators is unambiguous. The 
variation in the action contains contributions only from within the time values 
corresponding to the end-points of the transformation function for causality. 

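If one now introduces the identity Î = ∫ dσ_x |x)(x| into the transformation function
and substitutes the real-space representations of the operators, eqn. (14.121)
becomes

$$ \delta(\psi(t_2)|\psi(t_1)) = \frac{1}{i\hbar}\,\delta\int_{t_1}^{t_2}(dx)\,\psi^\dagger(x)\left[\frac{1}{2}\left(-i\hbar\nabla\dot{x} + i\hbar x\dot{\nabla}\right) + \frac{\hbar^2}{2m}\nabla^2 - V\right]\psi(x), \qquad (14.123) $$

which is equal to

$$ \delta(\psi(t_2)|\psi(t_1)) = \frac{1}{i\hbar}\,\delta\int_{t_1}^{t_2}(dx)\,\psi^\dagger(x)\left[\frac{i\hbar}{2}\left(\overrightarrow{\partial_t} - \overleftarrow{\partial_t}\right) + \frac{\hbar^2}{2m}\nabla^2 - V\right]\psi(x). \qquad (14.124) $$

Here δ refers only to the contents of the square brackets. This expression may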
be compared with the action for the Schrödinger field in eqn. (17.4). For 
the purposes of computing the variation, the form in eqn. (14.122) is most 
convenient. In fact, it is easy to see that, given that the variation of an operator
is well defined, the results are exactly analogous to the classical case.

If the end-points of the variation satisfy fixed boundary conditions, then

$$ \delta(\psi(t_2)|\psi(t_1)) = 0, \qquad (14.125) $$


since the field is constrained to admit no unitary transformations there, thus 
the right hand side of eqn. (14.121) is also zero. This, in turn, implies that the 
variation of the action operator vanishes, and this leads to the operator equations 
of motion, analogous to Hamilton’s equations: 


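$$ \delta\hat{S} = \int dt \left( -\dot{\hat{p}} - \frac{\partial\hat{H}}{\partial\hat{x}} \right)\delta\hat{x}, \qquad (14.126) $$

whence

$$ \dot{\hat{p}} = -\frac{\partial\hat{H}}{\partial\hat{x}}. \qquad (14.127) $$

Similarly, the variation with respect to the momentum operator leads to

$$ \dot{\hat{x}} = \frac{\partial\hat{H}}{\partial\hat{p}}, \qquad (14.128) $$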


whose consistency was verified in eqns. (14.118). This tells us that quantum 
mechanics, with commutators in place of Poisson brackets and differential 
operators acting on a Hilbert space, forms a well defined Hamiltonian system. 
Eqn. (14.124) shows that this is compatible with Schrödinger field theory. The
final piece of the puzzle is to generalize the variations of the action to include 
non-fixed end-points, in a way analogous to that in section 14.1.7. Then, using 
the equations of motion to set the bulk terms to zero, one has 


$$ \delta(\psi(t_2)|\psi(t_1)) = \frac{1}{i\hbar}(\psi(t_2)|G_2 - G_1|\psi(t_1)), \qquad (14.129) $$

which shows that the extended variation merely induces an infinitesimal unitary
transformation at the end-points of the variation. This variation is in accord with
eqn. (14.113), and one may verify that


$$ \delta\hat{x} = \frac{1}{i\hbar}[\hat{x}, G_x] = \frac{1}{i\hbar}[\hat{x}, \hat{p}\,\delta\hat{x}], \qquad (14.130) $$

which immediately gives the fundamental commutation relations

$$ [\hat{x}, \hat{p}] = i\hbar. \qquad (14.131) $$

This final piece of the puzzle verifies that the operator variational principle
is self-consistent for quantum mechanics. In fact, it can be generalized to 
other operators too, as we shall see in chapter 15, when we consider the fully 
quantized theory of fields. 


14.4.3 Relativistic quantum mechanics 


A Lorentz-invariant theory of quantum mechanics may be obtained by repeating 
the previous construction for the non-relativistic theory, replacing the non- 
relativistic energy relation in eqn. (14.92) with 


$$ E^2 = p^2c^2 + m^2c^4. \qquad (14.132) $$

One writes

$$ (-\hat{E}^2 + \hat{p}^2c^2 + m^2c^4)\,\phi(x) = 0, \qquad (14.133) $$

where Ê = iħ∂_t and p̂ = −iħ∇, and we call the field φ(x) to distinguish it from
the non-relativistic field. This leads us directly to the Klein-Gordon equation

$$ (-\hbar^2c^2\,\Box + m^2c^4)\,\phi = 0. \qquad (14.134) $$


However, all is not straightforward. The interpretation of this equation is full of
subtleties, which lead inexorably to a full quantum field theory. To begin with,
its quadratic nature implies that it has solutions of both arbitrarily large positive
and negative energy (see section 5.1.3). This further implies that the conserved
quantities normally used to define probability measures can also be negative; this
is difficult to interpret. Ultimately, the assumptions of quantum field theory save
the relativistic formulation. Leaning on these, relativistic quantum mechanics
survives as an approximation to the more complete quantum field theory under
conditions of ‘sufficient stability’.⁶
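
The origin of the two energy branches can be made explicit by inserting a plane wave into eqn. (14.133). This is a short worked sketch; the plane-wave form of the trial solution is an assumption made here for illustration:

```latex
\begin{align*}
\phi(x) &= e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}
\quad\Rightarrow\quad
\hat E\,\phi = i\hbar\,\partial_t\phi = \hbar\omega\,\phi ,
\qquad
\hat{\mathbf{p}}\,\phi = -i\hbar\nabla\phi = \hbar\mathbf{k}\,\phi , \\
0 &= \big(-\hbar^2\omega^2 + \hbar^2c^2\mathbf{k}^2 + m^2c^4\big)\,\phi
\quad\Rightarrow\quad
\hbar\omega = \pm\sqrt{\hbar^2c^2\mathbf{k}^2 + m^2c^4} .
\end{align*}
```

Because the constraint is quadratic in the energy, both signs of the root are admitted, and the magnitude of either branch grows without bound as the wavenumber increases.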


State vectors and wavefunctions In non-relativistic quantum mechanics it was 
easy to choose state vectors satisfying the Schrödinger equation because of the 
simple form of the conserved quantities arising from the linear time derivative 
(see eqn. (12.39)). The structural symmetry of the natural inner product,

$$ (\psi_1, \psi_2) = \int d\sigma\, \psi_1^\dagger\,\psi_2, \qquad (14.135) $$

means that the state vectors |ψ) and the adjoint (ψ| were simply Hermitian
conjugates of one another. In the case of the Klein-Gordon equation, matters
are more complicated. The corresponding invariant inner product is

$$ (\phi_1, \phi_2) = i\hbar c^2 \int d\sigma\, \left(\phi_1^*\,\overleftrightarrow{\partial_0}\,\phi_2\right), \qquad (14.136) $$


6 To make this woolly statement precise, one needs to address issues in the language of quantum 
field theory or renormalization group, which we shall only touch on in chapter 15. 
the symmetry of which is made less obvious by the time derivative, and one is
now faced with both positive and negative energy solutions. These two sets of
solutions decouple, however. If one splits the field into its positive and negative
energy parts,

$$ \phi(x) = \phi^{(+)}(x) + \phi^{(-)}(x), \qquad (14.137) $$

then one has, for a real scalar field,

$$ (\phi(x), \phi(x)) = (\phi^{(+)}(x), \phi^{(+)}(x)) + (\phi^{(-)}(x), \phi^{(-)}(x)) = 0; \qquad (14.138) $$

i.e.

$$ (\phi^{(+)}(x), \phi^{(+)}(x)) = -(\phi^{(-)}(x), \phi^{(-)}(x)), \qquad (\phi^{(+)}(x), \phi^{(-)}(x)) = 0, \qquad (14.139) $$

or, more generally,

$$ (\phi_A, \phi_B) = -(\phi_B^*, \phi_A^*). \qquad (14.140) $$


By analogy with the non-relativistic case, we wish to view this scalar product
as the definition of a vector space with vectors |φ) and adjoint vectors (φ|, such
that

$$ (\phi_1|\phi_2) = (\phi_1, \phi_2), \qquad (14.141) $$

i.e. the inner product on the vector space is identified with the conserved quantity
for the field. The φ_A satisfy the Klein-Gordon equation:

$$ \begin{aligned}
\phi(x) &= \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\, e^{ikx}\,\phi(k)\,\delta(p^2c^2 + m^2c^4) \\
&= \int \frac{d^nk}{(2\pi)^n}\,\frac{1}{2k_0c^2\hbar^2}\, e^{ikx}\left(\phi(p_0, \mathbf{p}) + \phi(-p_0, \mathbf{p})\right) \\
&= \int dV_k\, e^{ikx}\left(\phi^{(+)}(p) + \phi^{(-)}(p)\right). \qquad (14.142)
\end{aligned} $$

What makes the relativistic situation different is the fact that the energy
constraint surface is quadratic in k₀. The volume measure on momentum space
is constrained by this energy relation. This is the so-called mass shell. On the
manifold of only positive energy solutions, the volume measure is

$$ \begin{aligned}
V_k &= \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\,\delta(p^2c^2 + m^2c^4)\,\theta(k_0) \\
&= \int \frac{d^nk}{(2\pi)^n}\,\frac{1}{2k_0c^2\hbar^2}. \qquad (14.143)
\end{aligned} $$

Thus, if we examine complete sets of position and momentum eigenfunctions
on this constraint manifold, we find that the normalization of momentum
eigenfunctions is dictated by this constraint:

$$ \begin{aligned}
(x|x') &= \delta(x, x') \\
&= \int \frac{d^nk}{(2\pi)^n}\, e^{ik\cdot(x - x')} \\
&= \int dV_k\; 2k_0\hbar^2c^2\, e^{ik\cdot(x - x')}. \qquad (14.144)
\end{aligned} $$

From this expression, it follows that

$$ (x|k) = \sqrt{2k_0\hbar^2c^2}\; e^{ikx}, \qquad (14.145) $$

$$ (k|k') = \int d\sigma_x\, (k|x)(x|k') = 2k_0\hbar^2c^2\,\delta(k - k'). \qquad (14.146) $$

Thus the one-particle positive energy wavefunction is

$$ \psi(x) \equiv (x|\phi_1) = \int dV_k\, (x|k)(k|\phi_1) = N\int dV_k\, \sqrt{2k_0\hbar^2c^2}\; e^{ikx}\,\phi_1(k). \qquad (14.147) $$

Compare this with the re-scaling in eqn. (13.7). It is normalized such that

$$ \begin{aligned}
(\psi, \psi) = (\phi_1|\phi_1) &= 1 \\
&= N^2\int dV_k\, dV_{k'}\; \phi_1^*(k)\,(2k_0\hbar^2c^2)\,\phi_1(k')\,\delta(k - k') \\
&= N^2\int dV_k\, |\phi_1(k)|^2. \qquad (14.148)
\end{aligned} $$
The normalization factor, N, is fixed, at least in principle, by this relation,
perhaps through box normalization. This inner product is unambiguously
positive, owing to the restriction to only positive energies. An example is the
one-particle wavefunction in 3 + 1 dimensions:

$$ \psi(x)\Big|_{\phi_1 = 1} = N\int \frac{d^3k}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot\mathbf{x}}}{\sqrt{2k_0\hbar^2c^2}} = \text{const.}\left(\frac{m}{x}\right)^{5/4} H^{(1)}_{5/4}(imx), \qquad (14.149) $$

where H^{(1)}_{5/4}(x) is a Hankel (Bessel) function. What is significant here is that
the one-particle wavefunction is not localized like a delta function. Indeed, it
would be impossible to construct a delta function from purely positive energy
functions. Rather, it extends in space, decaying over a distance of order ħ/mc,
the Compton wavelength. See also section 11.2.


14.5 Canonically complete theories


The operational view of classical, statistical and quantum mechanics, which
has been laid out above, could seem sterile from a physical perspective. In
presenting it as a formal system of canonical equations, one eschews phe- 
nomenology entirely and uses only elementary notions based on symmetry. That 
such an approach is possible is surely an important insight. Mechanics should 
be regarded for what it is: a description of dynamics in terms of algebraic rules 
determined from necessary symmetries. Given the mathematical structure, more 
physical or philosophical discussions can follow of their own accord. 

The Hamiltonian dynamical formulation can, for the most part, be circum- 
vented completely by direct use of the action formalism in chapter 4. Again, we 
use a version of the action principle in which we allow infinitesimal canonical 
changes at the end-points of dynamical variations. 

The quantum theory, being linear, is essentially a theory of small disturbances.
The imprint left on the action by variation with respect to some variable is that
variable’s conjugate quantity. The conjugate quantity is said to be the generator
of the variation or disturbance. If one varies the action with respect to a set of
parameters ξ^i, and the action is invariant under changes of these parameters, the
variation must be zero. Manipulating the symbols in the action and separating
out the variation δξ to first order, one can write the infinitesimal variation in the
form

$$ \delta_\xi S = \int d\sigma^\mu\, G_i\,\delta\xi^i = 0, \qquad (14.150) $$

where dσ^μ represents a spacelike hyper-surface. The quantity G_i is the
generator of the symmetry in ξ^i. It is also called the variable conjugate to ξ^i.
Notice that an external source J_ext, such that

$$ S \to S + \int (dx)\, J_{\rm ext}\,\phi, \qquad (14.151) $$

acts as a generator for the field, throughout the spacetime volume,

$$ \delta S \to 0 + \int (dx)\, J_{\rm ext}\,\delta\phi, \qquad (14.152) $$
since the dynamical variation of the regular action vanishes. This observation 
has prompted Schwinger to develop the quantum theory of fields almost entirely 
with the aid of ‘sources’ or generators for the different dynamical and symmet- 
rical entities [119] (see table 14.3). 

In this chapter, we have compared the way in which classical and quantum 
mechanics are derivable from action principles and obey canonical completeness 
relations, in the form of Poisson brackets or commutators. This is no accident. 
Since the action principle always generates a conjugate momentum, as exhibited 

Table 14.3. Some canonical transformations.

T_ξ             δξ^i           Symmetry
∫ dσ^μ          δx^ν           Lorentz invariance
∫ dσ^μ T_μν     δx^ν           conformal invariance
p               δx             translational invariance
H               δt             time translation invariance
Π               δq, δφ, δψ     spacetime/canonical
J_ext           δφ, δψ         field canonical/unitary


by eqns. (4.62) and (4.66), one could define a canonical theory to be one which 
follows from an action principle. Thus, the action principle is always a good 
starting point for constructing a dynamical model of nature. To complete our 
picture of dynamics, it is necessary to resolve some of the problems which haunt 
the fringes of the classical relativistic theory. To do this, one must extend the 
dynamical content of the theory even more to allow the fields themselves to 
form a dynamical basis. As we saw in section 14.2.2, this was impractical using 
the Poisson bracket, so instead one looks for a commutator relation to generate 
infinitesimal variations. This completes the consistency of the formalism for 
relativistic fields. 


Epilogue: quantum field theory


Where does this description of matter and radiation need to go next? The answer 
is that it needs to include interactions between different physical fields and 
between different excitations in the same field. 

In order to pursue this course, one needs to extend the simple linear response 
analysis of classical field theory to a non-linear response analysis. In the 
presence of interactions, linear response is only a first-order approximation to 
the response of the field to a source. Interactions turn the fields themselves into 
sources (sources for the fields themselves and others to scatter from). Non-linear 
response theory requires quantum field theory, because the products of fields, 
which arise in interactions, bring up the issue of the ordering of fields, which 
only second quantization can resolve. It also brings the possibility of creation and
annihilation of particle pairs, and of negative probabilities, whose consistency is
repaired by the anti-particle concept and the vacuum concept.

An area which has not been touched upon in this book is that of Grassmann
variables [136], which describe fermionic interactions. These arose fleetingly in 
connection with TCP invariance (see section 10.5). In interacting theories, one 
needs to account for their anti-commuting properties. 

A full discussion of quantum field theory, and all of its computational 
algorithms, is far beyond the scope of this book. The purpose of this chapter 
is only to indicate briefly how the quantum theory of fields leads to more of 
the same. As usual, the most pleasing way to derive corrections to the classical 
theory within a dynamically complete framework is, of course, through an action 
principle. Schwinger’s generalization of the action principle, for quantum field 
theory, provides the most economical and elegant transition from the classical 
to the quantum universe. It leads, amongst other things, to the so-called 
effective action for quantum field theory. This effective action demonstrates, 
most forcibly, the way in which quantum field theory completes the cause-effect 
response theory we have used in previous chapters. 


15.1 Classical loose ends


In classical physics, few problems may be described by single-particle equa- 
tions. In many-particle theories, one normally invokes the continuum hypothesis 
and turns to effective equations in order to study bulk quantities. The same 
strategy is more difficult in the quantum theory, since an effective theory of the 
quantum nature is less obvious. Quantum field theory is a theory of multiple 
quanta, or discrete excitations within a field. More than that, it is a theory of 
multiple identical quanta. This is helpful in overcoming some of the problems 
with classical field theory. 

Quantum mechanics deals primarily with one-particle systems. In quantum 
mechanics, many-particle systems must be handled as discrete N-body prob- 
lems. Identical particles must be handled by cumbersome Slater determinants, 
or explicit symmetrization. In quantum field theory one replaces this with a 
continuum theory of field operators, subject to algebraic constraints. 

It is this last point which leads one to modify quantum mechanics. Instead of 
trying to symmetrize over wavefunctions for identical particles, one uses the nor- 
malization properties of the wavefunction to generate the required multiplicity. 
The identical nature of the particles then follows, subject to certain restrictions 
on the spacetime symmetries of the fields. In particular, it is necessary to specify 
the topology of the field with respect to interchanges of particles. The Pauli 
principle, in particular, places a strong constraint on the spacetime properties of 
the field. 

Finally, the existence of negative energy states requires additional assump- 
tions to avoid the problem of decay. The anti-particle concept and the vacuum 
concept (the existence of a lowest possible state) have a formal expression in 
quantum field theory, but have to be put in by hand in a classical theory. 


15.2 Quantum action principle 


Schwinger has introduced an action principle for quantum mechanics [115, 120] 
which turns out to be equivalent to path integral formulations [47]. In quantum 
field theory one is interested in computing transition or scattering amplitudes of 
the form 


$$ (s_2|s_1), \qquad (15.1) $$

where the states denoted by s₁ and s₂ are assumed to be complete at each
arbitrary time, so that

$$ (s_1|s_2) = \int (s_1, t_1|\alpha, t)\, d\alpha\, (\alpha, t|s_2, t_2). \qquad (15.2) $$

Schwinger’s quantum action principle states that

$$ \delta(s_2|s_1) = \frac{i}{\hbar}(s_2|\delta S_{12}|s_1), \qquad (15.3) $$

where S is the action operator, obtained by replacing the classical fields φ by
field operators φ̂. The form of the action is otherwise the same as in the classical
theory (which is the great bonus of this formulation). Specifically,

$$ S_{12} = \int (dx)\, \mathcal{L}. \qquad (15.4) $$

Fig. 15.1. Overlap between classical and quantum and statistical theories.


Since operators do not necessarily commute, one must adopt a specific operator- 
ordering prescription which makes the action self-adjoint 


$$ S^\dagger = S. \qquad (15.5) $$


The action should also be symmetrical with respect to time reversals, as in the 
classical theory. Central to quantum field theory is the idea of ‘unitarity’. This 
ensures the reversibility of physical laws and the conservation of energy. In view 
of the property expressed by eqn. (15.2), successive variations of the amplitude 
with respect to a source, 


$$ S \to S - \int (dx)\, J\phi, \qquad (15.6) $$

lead automatically to a time ordering of the operators:

$$ \begin{aligned}
\frac{\delta}{\delta J(x)}(s_2|s_1) &= -\frac{i}{\hbar}(s_2|\phi(x)|s_1) \\
&= -\frac{i}{\hbar}\int (s_2|\alpha, x)\, d\alpha\, (\alpha, x|\phi(x)|s_1), \\
\frac{\delta^2}{\delta J(x')\,\delta J(x)}(s_2|s_1) &= \left(-\frac{i}{\hbar}\right)^2 \int (s_2|\phi(x')|\alpha, x)\, d\alpha\, (\alpha, x|\phi(x)|s_1) \\
&= \left(-\frac{i}{\hbar}\right)^2 (s_2|T\,\phi(x')\phi(x)|s_1), \qquad (15.7)
\end{aligned} $$

where the T represents time ordering of the field operators. The classical
limit of the action principle is taken by letting ħ → 0, from which we obtain
δS = 0. Thus, only the operator equations of motion survive. The amplitude
is suppressed, and thus so are the states. This makes the operator nature of the
equations of motion unimportant.


15.2.1 Operator variations 


The objects of variation, the fields, are now operators in this formulation, so we 
need to know what the variation of an operator means. As usual, this can be 
derived from the differential generating structure of the action principle. 

It is useful to distinguish between two kinds of variation: variations which 
lead to unitary transformations of the field on a spacelike hyper-surface, and 
variations which are dynamical, or are orthogonal to, such a hyper-surface. 
Suppose we consider an infinitesimal change in the state |s₂), with generator
G,

$$ |s_2) \to |s_2) + \delta|s_2) = (1 - iG)|s_2), \qquad (15.8) $$

where G is a generator of infinitesimal unitary transformations U = e^{−iG}, such
that

$$ U^\dagger U = 1. \qquad (15.9) $$

Note that the transformation consists of the leading terms in an expansion of
e^{−iG}. Then we have

$$ \delta|a) = -iG|a), \qquad \delta(a| = (a|\,iG. \qquad (15.10) $$

So if F is any unitary operator, then the change in its value under this unitary
variation can be thought of as being due to the change in the states, as a result
of the unitary generator G:

$$ (a|F'|b) = (a'|(1 + iG)\,F\,(1 - iG)|b') + \cdots, \qquad (15.11) $$

which gives the first terms in the infinitesimal expansion of

$$ (a|F'|b) = (a'|e^{iG}\,F\,e^{-iG}|b'). \qquad (15.12) $$

Eqn. (15.11) can be likened to what one would expect to be the definition of
variation in the operator

$$ (a|F'|b) = (a'|F + \delta F|b'), \qquad (15.13) $$

in order to define the infinitesimal variation of an operator,

$$ \delta F = -i[F, G]. \qquad (15.14) $$


15.2.2 Example: operator equations of motion 


The same result can also be obtained from Hamilton’s equations for dynamical
changes. Consider variations in time. The generator of time translations is the
Hamiltonian:

$$ \delta F = \left(\frac{\partial F}{\partial t} - \frac{dF}{dt}\right)\delta t, \qquad (15.15) $$

since

$$ \delta F = F(t + \delta t) - F(t). \qquad (15.16) $$

(The numerical value of t is not affected by the operator transformation.) Hence,
using our definition, we obtain

$$ \frac{dF}{dt} = \frac{\partial F}{\partial t} + i[F, H], \qquad (15.17) $$


which is the time development equation for the operator F. 


15.3 Path integral formulation 


Feynman’s famous path integral formulation of quantum field theory can be 
thought of as an integral solution to the differential Schwinger action principle 
in eqn. (15.3). To see this, consider supplementing the action S with a source 
term [22, 125, 128] 


$$ S \to S - \int (dx)\, J\hat\phi. \qquad (15.18) $$

The operator equations of motion are now

$$ \frac{\delta S}{\delta\hat\phi} = E[\hat\phi] = J, \qquad (15.19) $$

and we define E[φ̂] as the operator one obtains from the first functional
derivative of the action operator. From the Schwinger action principle, we have,

$$ \frac{\delta^n}{\delta J^n}\langle 0|0\rangle\bigg|_{J=0} = \left(-\frac{i}{\hbar}\right)^n \langle 0|T\,\hat\phi(x_1)\cdots\hat\phi(x_n)|0\rangle, \qquad (15.20) $$


where the T indicates time ordering implicit in the action integral. This may be 
summarized (including causal operator ordering) by 


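$$ \langle 0|0\rangle_J = \left\langle 0\left|\,T\exp\left(-\frac{i}{\hbar}\int (dx)\, J\hat\phi\right)\right|0\right\rangle. \qquad (15.21) $$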


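We may now use this to write the operator E[φ̂] in terms of functional
derivatives with respect to the source,

$$ E\!\left[i\hbar\frac{\delta}{\delta J}\right]\langle 0|0\rangle_J = J\,\langle 0|0\rangle_J. \qquad (15.22) $$

This is a functional differential equation for the amplitude ⟨0|0⟩_J. We can
attempt to solve it by substituting a trial solution

$$ \langle 0|0\rangle_J = \int d\phi\, F[\phi]\,\exp\left(-\frac{i}{\hbar}\int (dx)\, J\phi\right). \qquad (15.23) $$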


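Substituting in, and using J = iħ δ/δφ when acting on the exponential,

$$ \begin{aligned}
0 &= \int d\phi \left\{ E\!\left[i\hbar\frac{\delta}{\delta J}\right] - J \right\} F[\phi]\,\exp\left(-\frac{i}{\hbar}\int (dx)\, J\phi\right) \\
&= \int d\phi \left\{ E[\phi]\,F[\phi]\,\exp\left(-\frac{i}{\hbar}\int (dx)\, J\phi\right) - i\hbar\,F[\phi]\,\frac{\delta}{\delta\phi}\exp\left(-\frac{i}{\hbar}\int (dx)\, J\phi\right) \right\}. \qquad (15.24)
\end{aligned} $$

Integrating by parts with respect to φ, moving the derivative δ/δφ, yields

$$ \begin{aligned}
0 &= \int d\phi \left\{ E[\phi]\,F[\phi] + i\hbar\frac{\delta F}{\delta\phi} \right\} \exp\left(-\frac{i}{\hbar}\int (dx)\, J\phi\right) \\
&\quad - i\hbar\left[ F[\phi]\,\exp\left(-\frac{i}{\hbar}\int (dx)\, J\phi\right) \right]_{-\infty}^{+\infty}. \qquad (15.25)
\end{aligned} $$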

Assuming that the surface term vanishes independently gives

$$ E[\phi]\,F[\phi] = -i\hbar\frac{\delta F}{\delta\phi}, \qquad (15.26) $$

and thus

$$ F[\phi] = C\exp\left(\frac{i}{\hbar}S[\phi]\right). \qquad (15.27) $$

Thus, the transformation function for vacua, in the presence of a source, may be
taken to be

$$ \langle 0|0\rangle_J = \int d\phi\, \exp\left(\frac{i}{\hbar}S[\phi] - \frac{i}{\hbar}\int (dx)\, J\phi\right). \qquad (15.28) $$
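
That the exponential of the action solves the functional differential equation (15.26) can be confirmed in one line. The following worked check assumes only the definition of E[φ] as the first functional derivative of the action, as stated below eqn. (15.19):

```latex
\begin{align*}
F[\phi] = C\exp\left(\frac{i}{\hbar}S[\phi]\right)
\quad\Rightarrow\quad
-i\hbar\,\frac{\delta F}{\delta\phi}
  = -i\hbar\cdot\frac{i}{\hbar}\,\frac{\delta S}{\delta\phi}\,F[\phi]
  = E[\phi]\,F[\phi] .
\end{align*}
```

The constant C is left undetermined by this equation and is a matter of normalization of the amplitude.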


15.4 Postscript 


For all of its limitations, classical covariant field theory is a remarkable stepping 
stone, both sturdy and refined, building on the core principles of symmetry and 
causality. Its second quantized extension has proven to be the most successful 
strategy devised for understanding fundamental physical law. These days, 
classical field theory tends to merit only an honourable mention, as a foundation 
for other more enticing topics, yet the theoretical toolbox of covariant classical 
field theory underpins fundamental physics with a purity, elegance and unity 
which are virtually unparalleled in science. By dwelling on the classical 
aspects of the subject, this book scratches the surface of this pivotal subject 
and celebrates more than a century of fascinating discovery. 


Gallery of definitions


16.1 Units 


SI units are used throughout this book unless otherwise stated. Most books on 
modern field theory choose natural units in which ħ = c = ε₀ = μ₀ = 1. With
this choice of units, very many simplifications occur, and the full beauty of the 
covariant formulation is apparent. The disadvantage, however, is that it distances 
field theory from the day to day business of applying it to the measurable world. 
Many assumptions are invalidated when forced to bow to the standard SI system 
of measurements. The definitions in this guide are chosen, with some care, to 
make the dimensions of familiar objects appear as intuitive as possible. 

Some old units, still encountered, include electrostatic units, rather than 
coulombs; ergs rather than joules; and gauss rather than tesla, or webers per 
square metre: 


Old          SI
1 e.s.u.     (1/3) × 10⁻⁹ C
1 erg        10⁻⁷ J
1 eV         1.6 × 10⁻¹⁹ J
1 Å          10⁻¹⁰ m
1 G          10⁻⁴ Wb m⁻²
1 γ          10⁻⁵ G
16.2 Constants


Planck’s constant               ħ = 1.055 × 10⁻³⁴ J s
speed of light in a vacuum      c = 2.998 × 10⁸ m s⁻¹
electron rest mass              m_e = 9.1 × 10⁻³¹ kg
proton rest mass                m_p = 1.67 × 10⁻²⁷ kg
Boltzmann’s constant            k_B = 1.38 × 10⁻²³ J K⁻¹
Compton wavelength              ħ/mc
structure constant              α = e²/(4πε₀ħc) = 1/137
classical electron radius       r₀ = e²/(4πε₀mc²) = 2.8 × 10⁻¹⁵ m
Bohr radius                     a₀ = 4πε₀ħ²/(e²m_e) = 0.5292 Å
electron plasma frequency       ω_p = √(Ne²/ε₀m) s⁻¹
                                ω_p ≈ 56 √N rad s⁻¹
cyclotron frequency             ω_c = ω_B = eB/m s⁻¹
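
The derived entries in this table can be checked numerically. The following is a minimal Python sketch, assuming CODATA-style SI input values for e, ε₀, ħ, c and m_e; these numbers and all function names are illustrative and are not taken from the text.

```python
import math

# Assumed SI input values (CODATA-style); the table quotes rounded figures.
HBAR = 1.054571817e-34       # J s
C = 2.99792458e8             # m s^-1
E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F m^-1
M_E = 9.1093837015e-31       # kg


def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c), dimensionless in n = 3."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * HBAR * C)


def classical_electron_radius():
    """r0 = e^2 / (4 pi eps0 m_e c^2), in metres."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * M_E * C ** 2)


def bohr_radius():
    """a0 = 4 pi eps0 hbar^2 / (m_e e^2), in metres."""
    return 4.0 * math.pi * EPS0 * HBAR ** 2 / (M_E * E_CHARGE ** 2)


def electron_plasma_frequency(number_density):
    """omega_p = sqrt(N e^2 / (eps0 m_e)), in rad s^-1, for N in m^-3."""
    return math.sqrt(number_density * E_CHARGE ** 2 / (EPS0 * M_E))


def cyclotron_frequency(b_field):
    """omega_c = e B / m_e, in rad s^-1, for B in tesla."""
    return E_CHARGE * b_field / M_E


inverse_alpha = 1.0 / fine_structure_constant()
plasma_coefficient = electron_plasma_frequency(1.0)  # the number multiplying sqrt(N)
```

Evaluating `electron_plasma_frequency(1.0)` gives the coefficient of √N in the approximate formula, so the rounded entries can be compared against values computed from first principles.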


16.3 Engineering dimensions 


In n spatial dimensions plus one time dimension, we have the following 
engineering dimensions for key quantities (note that square brackets denote the 
engineering dimension of a quantity): 


velocity (of light) [c] LT⁻¹
Planck’s constant [ħ] ML²T⁻¹
electric charge [e] L”+D/2T-2 
gravitational constant G M`!LîT? 
permittivity [€o] MILT? 
permeability [uo] MT*L-3 
structure constant [a] E 


The dynamical variables have dimensions: 


Schrödinger field [ψ] L^{-n/2}
Dirac field [ψ] L^{-n/2}
Klein—Gordon [4] L773 [A]-2 

= L742) /27- > M- E 
Maxwell [A] L!T[A] = MTL-@- YP? 
electric current density J, = 2% [e]L! "T! 


particle number density N L^{-n}

The plasma distributions are defined from the fact that their integral over a phase
space variable gives the number density:


N(x) = \int d^n v\; f(v, x) (16.1)


     = \int d^n p\; f(p, x), (16.2)


and so on. One is generally interested in the distribution as a function of velocity 
v, the momentum p or the wavenumber k. In natural units, where ħ = c = 1,
the above may be simplified by setting L = T = M⁻¹. Notice that all coupling
constants scale with spacetime dimension. 

The constants ε₀ and μ₀ are redundant scales; it is not possible to identify the
dimensions of the fields and couplings between matter and radiation uniquely.
Dimensional analysis of the action allows one to determine only two combinations:


[eA] = M L T^{-1}
\left[ \frac{e^2}{\epsilon_0} \right] = M L^n T^{-2}. (16.3)


These may be determined by identifying J^μ A_μ as an energy density and from
Maxwell’s equations, respectively. If we assume that ε₀ and μ₀ do not change
their engineering dimensions with the dimension of space n, then we can
identify the scaling relations


[A_\mu] \sim L^{-n/2}
[e] \sim L^{n/2}, (16.4)


where ~ means ‘scales like’. The former relation is demanded by the dimensions
of the action; the latter is demanded by the dimensions of the coupling
between matter and radiation, since the product eA_μ must be independent of the
dimension of spacetime. Although the dimensions of e, A_μ, ε₀ and μ₀ are not
absolutely defined, a solution is provided in the relations above.

From the above, one determines that the cyclotron frequency is independent
of the spacetime dimension,


\left[ \frac{eB}{m} \right] = T^{-1}, (16.5)


and that the structure constant α has dimensions


\left[ \frac{e^2}{4\pi \epsilon_0 \hbar c} \right] = L^{n-3}. (16.6)
The so-called plasma frequency is defined only for a given plasma charge
density ρ, since


\left[ \frac{e^2}{\epsilon_0 m} \right] = L^n T^{-2}. (16.7)


Thus, \omega_p^2 = \frac{e\rho}{\epsilon_0 m} = \frac{N e^2}{\epsilon_0 m}.
The Hall conductivity is a purely two-dimensional quantity. The dimensional
equation J = σ_H E can be verified for n = 2 and σ_H = e²/ħ, but it is noted
that each of the quantities scales in a way which requires an additional scale for
n ≠ 2.
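
The bookkeeping in this section can be mechanized by treating an engineering dimension as a triple of exponents (M, L, T). The sketch below is only an illustration: it assumes the two combinations of eqn. (16.3) read [eA] = MLT⁻¹ and [e²/ε₀] = MLⁿT⁻², takes [B] = [A]/L from B being a spatial derivative of A, and uses helper names that are not part of the text.

```python
def mul(a, b):
    """Multiply two quantities by adding their (M, L, T) exponents."""
    return tuple(x + y for x, y in zip(a, b))


def div(a, b):
    """Divide two quantities by subtracting their (M, L, T) exponents."""
    return tuple(x - y for x, y in zip(a, b))


def engineering_dimensions(n):
    """Dimensions of a few derived quantities in n spatial dimensions."""
    mass, length = (1, 0, 0), (0, 1, 0)
    hbar, c = (1, 2, -1), (0, 1, -1)
    e_a = (1, 1, -1)             # assumed: [eA] = M L T^-1
    e2_over_eps0 = (1, n, -2)    # assumed: [e^2/eps0] = M L^n T^-2
    return {
        # [eB/m], with [B] = [A]/L
        "cyclotron": div(div(e_a, length), mass),
        # [e^2 / (eps0 hbar c)]
        "alpha": div(e2_over_eps0, mul(hbar, c)),
        # [e^2 / (eps0 m)]
        "plasma": div(e2_over_eps0, mass),
    }


dims_3d = engineering_dimensions(3)
dims_2d = engineering_dimensions(2)
```

In this representation a quantity is independent of the dimension of space only if none of its exponents depends on n, which is the case for the cyclotron combination but not for the structure constant.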


16.4 Orders of magnitude 


16.4.1 Sizes 
Planck length                L_p = √(Għ/c³) = 1.6 × 10⁻³⁵ m
Planck time                  T_p = L_p/c = √(Għ/c⁵) = 5.3 × 10⁻⁴⁴ s
Planck mass                  M_p = √(ħc/G) = 2.1 × 10⁻⁸ kg
Planck energy                E_p = M_p c² = 1.8 × 10⁹ J = 1.2 × 10¹⁹ GeV
Hall conductance in n = 2    σ_H = e²/h
Landau length at k_B T       l = e²/(4πε₀k_BT) = 1.67 × 10⁻⁵/T m
Debye length                 h = √(ε₀kT/Ne²) = 69 √(T/N) m


The Landau length is that length at which the electrostatic energy of electrons 
balances their thermal energy. 
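
As a rough check of these scales, the following Python sketch evaluates the Planck quantities and the numerical coefficients of the Landau and Debye lengths. It assumes CODATA-style SI values for G, ħ, c, e, ε₀ and k_B; the input numbers and function names are illustrative and are not taken from the text.

```python
import math

# Assumed SI input values (CODATA-style).
HBAR = 1.054571817e-34       # J s
C = 2.99792458e8             # m s^-1
G_NEWTON = 6.67430e-11       # m^3 kg^-1 s^-2
E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F m^-1
K_B = 1.380649e-23           # J K^-1


def planck_length():
    return math.sqrt(G_NEWTON * HBAR / C ** 3)


def planck_time():
    return planck_length() / C


def planck_mass():
    return math.sqrt(HBAR * C / G_NEWTON)


def planck_energy_gev():
    """Planck energy M_p c^2 converted from joules to GeV."""
    return planck_mass() * C ** 2 / (E_CHARGE * 1.0e9)


def landau_length(temperature):
    """Length at which electrostatic and thermal energies balance (metres)."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * K_B * temperature)


def debye_length(temperature, number_density):
    """sqrt(eps0 k T / (N e^2)) in metres, for T in kelvin and N in m^-3."""
    return math.sqrt(EPS0 * K_B * temperature / (number_density * E_CHARGE ** 2))
```

Calling `landau_length(1.0)` and `debye_length(1.0, 1.0)` returns the numerical prefactors multiplying 1/T and √(T/N), respectively.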


16.4.2 Densities and pressure 


number density (‘particles’ per unit volume) N(x) or py 


number current N° = Nc, Ni = Nv’. N, or Jn p 
mass density current Ji = Ny 
charge density current Ja = Ny 
charge density or other sources Ju 
interstellar gas 10% m~’ 
ionosphere 10°-10'2 m~? 
solar corona 10! m~? 
solar atmosphere 10!8 m~’ 
laboratory plasma 10°<10" m~? 
mean density of Earth 5520 kg m⁻³
mean density of Jupiter/Saturn 1340/705 kg m⁻³
solar wind particle number density 3–20 cm⁻³
magneto-pause number density 10⁵–10⁶ m⁻³
Pressure is denoted by P and has the dimensions of energy density or force
per unit area. 


16.4.3 Temperatures 


interstellar gas 10° K 
Earth ionosphere 104 K 
solar corona 10° K 
solar atmosphere 10*K 
laboratory plasma 10° K 


super-conducting transition 0-100 K 
Bose–Einstein condensation μK–nK


16.4.4 Energies 


first ionization energies ~ 10 eV 
Van der Waals binding energy 2 keV 
covalent binding energy 20 keV 
hydrogen bond binding energy 20 keV 
plasma energies, solar wind 1-100 keV 
Planck energy E_p = M_p c² 1.8 × 10⁹ J = 1.2 × 10¹⁹ GeV
Lorentz energy-momentum tensor Ou 
conformal energy-momentum tensor vie 


16.4.5 Wavelengths 


radio waves > 10⁻² m
microwaves 10⁻² m
infra-red (heat) 10⁻³–10⁻⁶ m
visible light 10⁻⁶–10⁻⁷ m
ultra-violet 10⁻⁷–10⁻⁹ m
X-rays 10⁻⁹–10⁻¹¹ m
gamma rays < 10⁻¹¹ m
thermal de Broglie wavelength λ = ħ/√(2mk_BT)
hydrogen atom at 273 K 2.9 × 10⁻¹¹ m
hydrogen atom at 1 K 4.9 × 10⁻¹⁰ m
electron at 273 K 1.27 × 10⁻⁹ m
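
The three thermal wavelengths quoted above are consistent with the form λ = ħ/√(2mk_BT). The sketch below assumes that form, together with CODATA-style SI constants and a hydrogen-atom mass taken as m_p + m_e; the names are illustrative only.

```python
import math

# Assumed SI input values (CODATA-style).
HBAR = 1.054571817e-34       # J s
K_B = 1.380649e-23           # J K^-1
M_E = 9.1093837015e-31       # kg
M_P = 1.67262192369e-27      # kg
M_HYDROGEN = M_P + M_E       # binding energy neglected


def thermal_wavelength(mass, temperature):
    """Thermal de Broglie wavelength, assuming lambda = hbar / sqrt(2 m k_B T)."""
    return HBAR / math.sqrt(2.0 * mass * K_B * temperature)


hydrogen_273 = thermal_wavelength(M_HYDROGEN, 273.0)
hydrogen_1 = thermal_wavelength(M_HYDROGEN, 1.0)
electron_273 = thermal_wavelength(M_E, 273.0)
```

Because λ scales as 1/√(mT), the electron value exceeds the hydrogen value at the same temperature by the factor √(m_H/m_e), roughly 43.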
16.4.6 Velocities


speed of light in vacuum c 3.0 × 10⁸ m s⁻¹
solar wind 300–800 km s⁻¹
phase velocity ω/k = v_ph(k)
group velocity ∂ω/∂k = v_g(k)
energy transport velocity E =v, 


16.4.7 Electric fields 


geo-electric field at surface (fine weather) 100 V m⁻¹
geo-electric field at surface (stormy weather) 1000 V m⁻¹
auroral field 1073-107? Vm"! 


16.4.8 Magnetic fields 


intense laboratory field H H ~ 10° Am! [102] 
highest coercive field H in minerals H ~ 10° Am! [102] 
geo-magnetic field H ~ 10 Am! [102] 
geo-magnetic field B₀ = 1.88 × 10⁻⁵ tesla
vertical geo-magnetic field B_v = B₀ tan δ, δ = declination from north
Earth dipole moment 7.95 × 10²² A m²


16.4.9 Currents 


atmospheric current from ionosphere to ground 10- Am”? [50] 
auroral current aligned with field 1077 Am? 
ionospheric dynamo 500 000 A (eastward) 


16.5 Summation convention 


Einstein’s summation convention is used throughout this book. This means that 
repeated indices are summed over implicitly: 


\sum_A \phi_A \phi_A \rightarrow \phi_A \phi_A, (16.8)

and

\sum_\mu A^\mu A_\mu \rightarrow A^\mu A_\mu. (16.9)


In other words, summation signs are omitted for brevity. 
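
The implicit sum over a repeated spacetime index can be written out explicitly in a few lines of Python. The sketch assumes the Minkowski metric diag(−1, 1, 1, ...) used in this book; the function names and sample components are illustrative only.

```python
def minkowski_metric(n):
    """eta_{mu nu} = diag(-1, 1, ..., 1) in n + 1 dimensions."""
    size = n + 1
    return [
        [(-1.0 if mu == 0 else 1.0) if mu == nu else 0.0 for nu in range(size)]
        for mu in range(size)
    ]


def lower_index(metric, upper):
    """A_mu = eta_{mu nu} A^nu, with the sum over nu written out."""
    size = len(upper)
    return [sum(metric[mu][nu] * upper[nu] for nu in range(size)) for mu in range(size)]


def contract(upper, lower):
    """A^mu A_mu, the sum that the convention leaves implicit."""
    return sum(u * l for u, l in zip(upper, lower))


a_upper = [2.0, 1.0, 0.0, 0.0]                       # sample components A^mu
a_lower = lower_index(minkowski_metric(3), a_upper)  # components A_mu
invariant = contract(a_upper, a_lower)               # -(2)^2 + (1)^2
```

With this signature the sample vector is timelike, so the contraction is negative.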
16.6 Symbols and signs


16.6.1 Basis notation


g_μν                 the spacetime metric with signature −+++···
η_μν                 the constant Minkowski spacetime metric
                     with value diag(−1, 1, 1, 1, ...)
g = −det g_μν        the unsigned determinant of the metric
                     which appears in volume measures
μ, ν, λ, ρ ...       Greek indices are spacetime-covariant and
                     run from 0, ..., n in n + 1 dimensions
i, j, k = 1, ..., n  Latin indices refer to spatial dimensions
∂_i                  shorthand for ∂/∂x^i etc.
A, B = 1, ..., d_R   upper case Latin indices are the components
                     of a group multiplet for non-spacetime
                     groups, e.g. charge, colour,
                     in a general representation G_R
a, b = 1, ..., d_G   lower case Latin indices are group
                     indices which belong to the adjoint
                     representation G_adj
σ                    signifies space
dσ = dx¹ ... dxⁿ     the spatial volume element
dσ^μ                 volume element for a spacelike hyper-surface
∂_σ                  derivative normal to a spacelike hyper-surface;
                     has the canonical interpretation ∂_0


U, or U E matrix element of a transformation group 


Some books make the abbreviation (∂_μ φ)² when, in fact, they mean
(∂_μ φ)(∂^μ φ). In this text (∂_μ φ)² means only (∂_μ φ)(∂_μ φ), which differs by
a factor of the metric. Note that, because of the choice of metric above,


(∂_i φ)² = (∂^i φ)(∂_i φ) = (∂_i φ)(∂_i φ).
16.6.2 Volume elements


dV_x                 invariant volume element in (n + 1) dimensional
                     spacetime; dV_x = dx⁰dx¹dx² ... dxⁿ √g
dV_t = (1/c) dV_x    the volume element which appears in
                     most dynamical contexts, such as
                     the action
(dx) = dV_t          alternative notation for dV_t
dσ^μ                 the volume element on spacelike hyper-surface
                     with a unit normal n^μ parallel to dσ^μ

dσ = (d\mathbf{x})   an abbreviation for dσ⁰, the ‘canonical’ spacelike
                     hyper-surface; dσ = dx¹ ... dxⁿ √(det g_ij)


The volume element appearing in the action, and in most transformations, is 
(dx), which differs from the spacetime volume element by a factor of 1/c. This 
is because the action has dimensions of energy x time. Had the action been 
defined with an extra factor of c one could have avoided this blemish, but that is 
not traditionally the case. In natural units, ħ = c = 1, this problem is concealed.


16.6.3 Symmetrical and anti-symmetrical combinations 


A bar (like a mean value) is used for objects which are symmetrical, e.g.


\bar{x} = \frac{1}{2}(x_1 + x_2), (16.10)

whereas a tilde is used to signify anti-symmetry:

\tilde{x} = (x_1 - x_2). (16.11)


Similarly, tensor parts \bar{T}_{ij} and \tilde{T}_{ij} are, by assumption, symmetrical and
anti-symmetrical parts.


16.6.4 Derivatives 


Field theory abounds with derivatives. Since we often have use for more 
symbols than are available, some definitions depend on context. 
the total derivative (this object is seldom used)


\partial/\partial x^\mu = \overset{x}{\partial}_\mu   the partial derivative acting on x, e.g. \overset{x}{\partial}_\mu G(x, x')


D_μ                  a generic derivative; it commonly denotes the gauge-covariant
                     derivative D_μ = ∂_μ − ieA_μ


∇_μ                  the Lorentz-covariant derivative, which includes the ‘affine
                     connection’; ∇_μ is the same as ∂_μ when acting on scalar fields,
                     but for a vector field ∇_μ A^ν = ∂_μ A^ν + Γ^ν_{μλ} A^λ


∇² = ∇^i ∇_i         the spatial Laplacian


□ = ∇^μ ∇_μ          the d’Alembertian operator; in Cartesian coordinates,
                     □ = −(1/c²) ∂_t² + ∂_i ∂^i, but generally □ = (1/√g) ∂_μ (√g g^{μν} ∂_ν)


a partial derivative in which the speed of light is replaced by 
an effective speed of light; also used for higher-dimensional 
indices in Kaluza—Klein theory. 


16.6.5 Momenta 


p_i                  the kinetic momentum, also denoted \mathbf{p};
                     the generalization of mass × velocity in classical mechanics.
                     Quantum theory replaces this by −iħ∂_i


p_μ                  the n + 1 dimensional spacetime momentum,
                     a covariant representation of energy and momentum.
                     The μ = 0 component is E/c and the spatial
                     components are p_i


π_μ                  the covariant momentum. It is the analogue of
                     p_μ but includes any covariant connections,
                     e.g. the electromagnetic vector potential, or the
                     spacetime ‘affine’ connection,
                     e.g. π_μ is −iħD_μ or −iħ∇_μ


Π_σ                  (or simply Π) is the canonical momentum, defined
                     by the surface term of the variation of the action (see eqn. (4.23)).
                     The covariant definition of the momentum conjugate
                     to q(x) is Π_σ = δL/δ(∂^σ q), where σ is a timelike direction;
                     e.g. Π_0 = δL/δ(∂^0 q). This quantity does not have the dimensions of
                     momentum: it is referred to only as a momentum in the sense of
                     being canonically conjugate to the field variable q(x) (which does
                     not have the dimensions of position)


q̂, p̂_i              coordinates and momenta which are re-scaled so as
                     to have common engineering dimensions


(dk)                 Schwinger notation for the integration measure d^{n+1}k/(2π)^{n+1}
(d\mathbf{k})        Schwinger notation for the integration measure d^n k/(2π)^n


16.6.6 Position, velocity and acceleration 


ri =x; —x; aray between spatial position x and x’ 
fi a hat can indicate a unit vector 
v^i                  components of the velocity of an object in the
                     laboratory frame
β^i                  same as above in units of the speed of light, β^i = v^i/c
β²                   β^i β_i, where β^i = v^i/c
γ                    relativistic contraction factor 1/√(1 − β²)


U}, pe velocity (n + 1)-vector; U” = y (c, vi) = yc” 

U* = 0,x* is not directly measurable, but 
transforms as a vector under Lorentz transformations 
BY = 0,x" does not because 9, is not invariant 

a” = 098" components of the acceleration 
in the laboratory frame a! = 0! /c?. 
This quantity does not transform as an (n + 1)-vector 

A" = 0x" acceleration vector, transforms 
as a tensor of rank 1 

ω/k = v_ph(k)        phase velocity


∂ω/∂k = v_g(k)       group velocity


an energy transport velocity 


16.7 Limits 


It is natural to check derived expressions in various limits to evaluate their 
reliability. Here we list a few special limits which might be taken in this 
context and caution against possibly singular limits. While it might be possible 
to set quantities such as the mass and magnetic field strength to zero without 
incurring any explicit singularities, one should not be surprised if these limits 
yield inconsistent answers compared with explicit calculations in their absence. 
In many cases the singular nature of these limits is not immediately obvious,
and one must be careful to set these to zero only at the end of a calculation, or
risk losing terms of importance:


c → ∞    non-relativistic limit
ħ → 0    classical limit
B → 0    the limit of zero magnetic field is often singular
m → 0    the limit of zero mass is often singular
R → 0    the limit of zero curvature is often singular.
The Schrödinger field


The Schrödinger equation is the quantum mechanical representation of the
non-relativistic energy equation


\frac{p^2}{2m} + V = E (17.1)


and is obtained by making the replacement p_i → −iħ∂_i and E → iħ∂_t, and
allowing the equation to operate on a complex field ψ(x). The result is the basic
equation of quantum mechanics


\left( -\frac{\hbar^2 \nabla^2}{2m} + V \right) \psi = i\hbar\, \partial_t \psi, (17.2)

which may also be written


H_D \psi = i\hbar\, \partial_t \psi, (17.3)


thereby defining the differential Hamiltonian operator. The free Hamiltonian
operator H_0 is defined to be the above with V = 0.


17.1 The action 
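The action for the Schrödinger field is


S = \int d\sigma\, dt \left\{ -\frac{\hbar^2}{2m} (\partial^i \psi)^* (\partial_i \psi) - V \psi^* \psi
    + \frac{i\hbar}{2} \left( \psi^* \partial_t \psi - \psi\, \partial_t \psi^* \right) \right\}. (17.4)


Notice that this is not Lorentz-invariant, and cannot be expressed in terms of
n + 1 spacetime dimensional vectors.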
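17.2 Field equations and continuity


The variation of the action can be performed with respect to both ψ(x) and
ψ*(x) since these are independent variables. The results are


\delta_{\psi^*} S = \int d\sigma\, dt\; \delta\psi^* \left( \frac{\hbar^2}{2m} \nabla^2 \psi - V\psi + i\hbar\, \partial_t \psi \right)
    + \int d\sigma \left[ -\frac{i\hbar}{2}\, \delta\psi^*\, \psi \right] + \int dt\, d\sigma^i \left[ -\frac{\hbar^2}{2m}\, \delta\psi^*\, \partial_i \psi \right] = 0

\delta_{\psi} S = \int d\sigma\, dt\; \delta\psi \left( \frac{\hbar^2}{2m} \nabla^2 \psi^* - V\psi^* - i\hbar\, \partial_t \psi^* \right)
    + \int d\sigma \left[ \frac{i\hbar}{2}\, \delta\psi\, \psi^* \right] + \int dt\, d\sigma^i \left[ -\frac{\hbar^2}{2m}\, \delta\psi\, \partial_i \psi^* \right] = 0, (17.5)


where we have used integration by parts, and the two expressions are mutually
conjugate. From the surface terms, we can now infer that the canonical
momentum conjugate to ψ(x) is


\Pi = i\hbar\, \psi^*, (17.6)


and that spatial continuity at an interface is guaranteed by the condition


\Delta \left( \frac{\hbar^2}{2m}\, \partial_i \psi \right) = 0, (17.7)


where Δ means the change in value across the interface.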


17.3 Free-field solutions 


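The free-field solutions may be written in a compact form as a linear combination
of plane waves satisfying the energy constraint:


\psi(x) = \int_0^{\infty} \frac{d\omega}{2\pi} \int_{-\infty}^{+\infty} \frac{d^n k}{(2\pi)^n}\; e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}\, \psi(\mathbf{k}, \omega)
    \times \theta(\omega)\, \delta\left( \frac{\hbar^2 \mathbf{k}^2}{2m} - \hbar\omega \right). (17.8)


The coefficients of the Fourier expansion ψ(k, ω) are arbitrary.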
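
A quick numerical check of the energy constraint is to verify that a single plane wave with ħω = ħ²k²/2m satisfies the free Schrödinger equation. The sketch below works in one spatial dimension with ħ = m = 1 and uses central finite differences; these choices, the step size and the function names are assumptions made only for illustration.

```python
import cmath

HBAR = 1.0   # assumed units with hbar = 1
MASS = 1.0   # assumed units with m = 1


def plane_wave(k, x, t):
    """exp(i(kx - omega t)) with omega fixed by hbar*omega = hbar^2 k^2 / 2m."""
    omega = HBAR * k * k / (2.0 * MASS)
    return cmath.exp(1j * (k * x - omega * t))


def free_schrodinger_residual(k, x, t, step=1.0e-3):
    """|(-hbar^2/2m) d^2 psi/dx^2 - i hbar d psi/dt| by central differences."""
    second_x = (
        plane_wave(k, x + step, t)
        - 2.0 * plane_wave(k, x, t)
        + plane_wave(k, x - step, t)
    ) / step ** 2
    first_t = (plane_wave(k, x, t + step) - plane_wave(k, x, t - step)) / (2.0 * step)
    return abs(-HBAR ** 2 / (2.0 * MASS) * second_x - 1j * HBAR * first_t)


residual = free_schrodinger_residual(k=2.0, x=0.3, t=0.7)
```

The residual is limited only by the finite-difference truncation error; replacing the dispersion relation by any other ω(k) would leave a residual of order one.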


17.4 Expression for the Green function 


The Schrödinger Green function contains purely retarded solutions. This is a
consequence of its spectrum of purely positive energy solutions. If one views
the Schrödinger field as the non-relativistic limit of a relativistic field, then the
negative frequency Wightman function for the relativistic field vanishes in the
non-relativistic limit as a result of choosing only positive energy solutions. The
Fourier space expression for the free-field Green function is


G_{NR}(x, x') = \int \frac{d\omega}{2\pi} \int_{-\infty}^{+\infty} \frac{d^n k}{(2\pi)^n}\; \frac{e^{i(\mathbf{k}\cdot\Delta\mathbf{x} - \omega \Delta t)}}{\left( \frac{\hbar^2 \mathbf{k}^2}{2m} - \hbar\omega \right) - i\epsilon}. (17.9)


This may be interpreted in the light of the more general expression:


Gyre, x’) = D>) Ot t) un (xu (x’) 


n 


p =] ee (17.10) 


on a — @n +1€ 


where u_n(x) are a complete set of eigenfunctions of the free Hamiltonian, i.e.

(H_0 - E_n)\, u_n(x) = 0, (17.11)


where E_n = ħω_n.


17.5 Formal solution by Green functions 


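The free Schrödinger Green function satisfies the equation


\left( -\frac{\hbar^2 \nabla^2}{2m} - i\hbar\, \partial_t \right) G_{NR}(x, x') = \delta(\mathbf{x}, \mathbf{x}')\, \delta(t, t'), (17.12)

or

(H_0 - E)\, G_{NR}(x, x') = \delta(\mathbf{x}, \mathbf{x}')\, \delta(t, t'), (17.13)

and provides the solution for the field perturbed by source J(x),

\psi(x) = \int (dx')\, G_{NR}(x, x')\, J(x'). (17.14)


The infinitesimal source J is not normally written as such, but rather in the
framework of the potential V, so that J = −Vψ:


(H_0 - E_n)\, \psi_n = -V \psi_n, (17.15)


where ψ(x) = Σ_n c_n ψ_n(x). Substitution of this into the above relation leads to
an infinite regression:


\psi(x) = \int (dx')\, G_{NR}(x, x')\, J(x')
        = -\int (dx')\, G_{NR}(x, x')\, V(x')\, \psi(x')
        = \int (dx')(dx'')\, G_{NR}(x, x')\, V(x')\, G_{NR}(x', x'')\, V(x'')\, \psi(x''), (17.16)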
and so on. This multiplicative hierarchy is only useful if it converges. It is thus
useful to make this into an additive series, which converges for sufficiently weak
V(x). To do this, one defines the free field ψ_0(x) as the solution of the free-field
equation


(H_0 - E_n)\, \psi_{0n}(x) = 0, (17.17)


and expands in the manner of a perturbation series. The solutions to the full-field
equation are defined by


\psi_n(x) = \psi_{0n}(x) + \delta\psi_n(x), (17.18)


where the latter terms are assumed to be small in the sense that they lead to
convergent results in calculations. Substituting this into eqn. (17.15) gives


(H_0 - E_n)\, \delta\psi_n = -V \psi_n(x), (17.19)

and thus

\delta\psi(x) = -\int (dx')\, G_{NR}(x, x')\, V(x')\, \psi(x'), (17.20)

or

\psi(x) = \psi_0(x) - \int (dx')\, G_{NR}(x, x')\, V(x')\, \psi(x'). (17.21)


This result is sometimes called the Lippmann–Schwinger equation. The equation
can be solved iteratively by re-substitution, i.e.


\psi(x) = \psi_0(x) - \int (dx')\, G_{NR}(x, x')\, V(x')\, \psi_0(x')
        + \int (dx')(dx'')\, G_{NR}(x, x')\, V(x')\, G_{NR}(x', x'')\, V(x'')\, \psi(x''),
(17.22)


and generates the usual quantum mechanical perturbation series, expressed in 
the form of Green functions. 
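
The iterative structure of the Lippmann–Schwinger equation is easiest to see in a discretized toy model, in which the integral over x′ becomes a matrix product. The sketch below is only an illustration under stated assumptions: the 2 × 2 matrix `GREEN`, the diagonal `POTENTIAL` and the vector `PSI0` are arbitrary numbers chosen so that the series converges, not a discretization of the actual Green function.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product, standing in for the integral over x'."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def born_iterate(green, potential, psi0, orders):
    """Re-substitute psi -> psi0 - G V psi, starting from psi = psi0."""
    psi = list(psi0)
    for _ in range(orders):
        source = [v * p for v, p in zip(potential, psi)]        # V psi
        correction = mat_vec(green, source)                     # G V psi
        psi = [p0 - dp for p0, dp in zip(psi0, correction)]     # psi0 - G V psi
    return psi


def lippmann_schwinger_residual(green, potential, psi0, psi):
    """Largest component of psi + G V psi - psi0, zero for an exact solution."""
    source = [v * p for v, p in zip(potential, psi)]
    correction = mat_vec(green, source)
    return max(abs(p + dp - p0) for p, dp, p0 in zip(psi, correction, psi0))


GREEN = [[0.5, 0.2], [0.2, 0.5]]   # hypothetical discretized Green function
POTENTIAL = [0.3, 0.1]             # hypothetical weak potential, diagonal in x
PSI0 = [1.0, 1.0]                  # hypothetical free solution

first_order = born_iterate(GREEN, POTENTIAL, PSI0, 1)
converged = born_iterate(GREEN, POTENTIAL, PSI0, 60)
residual = lippmann_schwinger_residual(GREEN, POTENTIAL, PSI0, converged)
```

One pass of the loop reproduces the first-order term of the series; further passes add higher orders, and the iteration settles only because the product of `GREEN` and `POTENTIAL` is small, which mirrors the requirement of a sufficiently weak V(x).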


17.6 Conserved norm and probability 
The variation of the action with respect to constant δs under a phase transformation
ψ → e^{is}ψ is given by


hi? 
ôS = foo oe [—-i55(0'W*)(0;W) + (0' W*)ids (8; y)] 


+i |-issy*ðy T issy*õy | i (17.23) 
Integrating by parts and using the equation of motion, we obtain the expression
for the continuity equation,


\delta S = \int (dx)\, \delta s \left( \partial_t J^t + \partial_i J^i \right) = 0, (17.24)

where

J^t = \psi^* \psi = \rho
J^i = -\frac{i\hbar}{2m} \left[ \psi^* (\partial^i \psi) - (\partial^i \psi^*)\, \psi \right], (17.25)


which can be compared to the current conservation equation eqn. (12.1). ρ is the
probability density and J^i is the probability current. The conserved probability
is therefore


P = \int d\sigma\; \psi^*(x)\, \psi(x), (17.26)


and this can be used to define the notion of an inner product between two
wavefunctions, given by the overlap integral


(\psi_1, \psi_2) = \int d\sigma\; \psi_1^*(x)\, \psi_2(x). (17.27)


17.7 Energy—momentum tensor 


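Replacing η_μν by δ_μν (the Euclidean metric), we have for the components of the
energy–momentum tensor:


\theta_{tt} = \frac{\partial \mathcal{L}}{\partial(\partial^t \psi)}\, \partial_t \psi + (\partial_t \psi^*)\, \frac{\partial \mathcal{L}}{\partial(\partial^t \psi^*)} - \mathcal{L}
    = \frac{\hbar^2}{2m} (\partial^i \psi)^* (\partial_i \psi) + V \psi^* \psi, (17.28)
    \equiv \mathcal{H}. (17.29)


In the second-quantized theory, where ψ(x) is a field operator, this quantity is
often called the Hamiltonian density operator \mathcal{H}. This is to be distinguished
from H_D, the differential Hamiltonian operator. In the classical case, the spatial
integral of θ_tt is the expectation value of the Hamiltonian, as may be seen by
integration by parts:


H = \int d\sigma\; \theta_{tt} = \int d\sigma\; \psi^*(x) \left( -\frac{\hbar^2 \nabla^2}{2m} + V \right) \psi(x)
  = (\psi, H_D \psi)
  \equiv \langle H_D \rangle. (17.30)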
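Thus, θ_tt represents the total energy of the fields in the action S. The off-diagonal
spacetime components are related to the expectation value of the momentum
operator


\theta_{ti} = \frac{\partial \mathcal{L}}{\partial(\partial^t \psi)}\, \partial_i \psi + (\partial_i \psi^*)\, \frac{\partial \mathcal{L}}{\partial(\partial^t \psi^*)} = \frac{i\hbar}{2}\, \psi^* \partial_i \psi - \frac{i\hbar}{2}\, (\partial_i \psi^*)\, \psi

\int d\sigma\; \theta_{ti} = (\psi, i\hbar\, \partial_i \psi)
    = -\langle p_i \rangle, (17.31)

and

\theta_{it} = \frac{\partial \mathcal{L}}{\partial(\partial^i \psi)}\, \partial_t \psi + (\partial_t \psi^*)\, \frac{\partial \mathcal{L}}{\partial(\partial^i \psi^*)}
    = -\frac{\hbar^2}{2m} \left\{ (\partial_i \psi^*)(\partial_t \psi) + (\partial_t \psi^*)(\partial_i \psi) \right\}. (17.32)


Note that θ is not symmetrical in the spacetime components: θ_it ≠ θ_ti. This is
a result of the lack of Lorentz invariance. Moreover, the sign of the momentum
component is reversed, as compared with the relativistic cases, owing to the
difference in metric signature. Finally, the ‘stress’ in the field is given by the
spatial components:


\theta_{ij} = \frac{\partial \mathcal{L}}{\partial(\partial^i \psi)}\, \partial_j \psi + (\partial_j \psi^*)\, \frac{\partial \mathcal{L}}{\partial(\partial^i \psi^*)} - \mathcal{L}\, \delta_{ij}
    = -\frac{\hbar^2}{2m} \left[ (\partial_i \psi^*)(\partial_j \psi) + (\partial_j \psi^*)(\partial_i \psi) - (\partial^k \psi)^* (\partial_k \psi)\, \delta_{ij} \right]
    + \left[ V \psi^* \psi - \frac{i\hbar}{2} \left( \psi^* \partial_t \psi - \psi\, \partial_t \psi^* \right) \right] \delta_{ij}. (17.33)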
Using the field equation (17.2), the trace of the spatial part may be written 


h? y 1# S 
Tru = (n — 2)—_(O" Y )* (Ory) +n (vwy o v) 
2m 22m 
=(1-n) H +2Vy*y, (17.34) 


where the last line is obtained by partial integration over all space, and on 
identifying the first and last terms as being H — V, and is therefore true only 
up to a partial derivative, or under the integral sign. See also Jackiw and Pi for 
a discussion of a conformally improved energy-momentum tensor, coupled to 
electromagnetism [78]. 
The real Klein–Gordon field


The relativistic scalar field satisfies the Klein–Gordon equation. This equation
can be interpreted as the quantum mechanical analogue of the relativistic energy
relation

E^2 = \mathbf{p}^2 c^2 + m^2 c^4, (18.1)


and is found by making the usual replacement p_μ → −iħ∂_μ, and allowing the
equation to operate on a real scalar field φ(x). The result is


\left( -\Box + \frac{m^2 c^2}{\hbar^2} \right) \phi(x) = 0. (18.2)
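
The energy relation behind the Klein–Gordon equation admits both signs of E, and its positive root reduces to the Newtonian kinetic energy for small momenta. The following Python sketch illustrates both points; the units with c = 1 and the sample values of m and p are assumptions made only for this example.

```python
import math


def relativistic_energy(p, m, c=1.0):
    """Positive root of E^2 = p^2 c^2 + m^2 c^4."""
    return math.sqrt(p * p * c * c + m * m * c ** 4)


def on_shell_residual(energy, p, m, c=1.0):
    """E^2 - p^2 c^2 - m^2 c^4, which vanishes for either sign of E."""
    return energy ** 2 - p * p * c * c - m * m * c ** 4


mass, momentum = 1.0, 1.0e-3                 # illustrative values, units with c = 1
energy = relativistic_energy(momentum, mass)
kinetic = energy - mass                      # energy above the rest energy m c^2
newtonian = momentum ** 2 / (2.0 * mass)     # non-relativistic kinetic energy
```

The ratio `kinetic / newtonian` differs from unity by roughly p²/(4m²c²), which is the leading relativistic correction.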


18.1 The action 


If we generalize the single scalar field above to a set of N real scalar fields
φ_A(x) for A = 1, ..., N, with a linear perturbation, J_A, then all of the physical
information about this system can be derived from the following action:


S = \int (dx) \left\{ \frac{1}{2} \hbar^2 c^2 (\partial^\mu \phi_A)(\partial_\mu \phi_A) + \frac{1}{2} m^2 c^4 \phi_A \phi_A + V(\phi) - J_A \phi_A \right\}.
(18.3)


Note that the position of the A indices is immaterial here, since they only label 
the number of the field components. The repeated indices are summed using a 
Euclidean metric, for which there is no notion of ‘up’ or ‘down’ indices. 

Looking at this action, it can be noted that it does not have the familiar form 
of an integral over T — V (kinetic energy minus potential energy). Instead, it 
has the form of an integral over −E² + p² + m² + V. Although this looks
dimensionally incorrect, this is not the case, since the dimensions of the field 
are simply chosen so that S has the dimensions of action. 
In what follows, the position of an index A is chosen for clarity. The
Lagrangian density \mathcal{L} is defined by


S = \int (dx)\, \mathcal{L}. (18.4)


In the usual canonical tradition, we define the conjugate momentum to the field
φ_A(x) by

\Pi^A_\sigma = \frac{\delta \mathcal{L}}{\delta(\partial^\sigma \phi_A)} = \hbar^2 c^2\, \partial_\sigma \phi_A, (18.5)

where σ is a specific direction, normal to a spacelike hyper-surface. Usually we
do not need to be this general and we can just pick σ = 0 for the normal, which
corresponds to the time direction (normal to space in an observer’s rest frame).
Then we have, more simply,


\Pi^A_0 = \hbar^2 c^2\, \partial_0 \phi_A. (18.6)


The Hamiltonian density is then obtained straightforwardly from the Legendre
transformation


\mathcal{H} = \Pi_0 (\partial_0 \phi) - \mathcal{L}\, g_{00}. (18.7)

Or, using the fully covariant form,

\mathcal{H} = \Pi_\sigma (\partial_\sigma \phi) - \mathcal{L}\, g_{\sigma\sigma}. (18.8)


Note the positions of the indices here and the presence of the metric in the
second term of the right hand side. The need for this factor will become
apparent later when looking at transformations and the energy-momentum
tensor. It makes the relativistic Legendre transformation more subtle than that
in Euclidean space, because of the indefinite metric. Eqns. (18.7) and (18.8)
evaluate to


\mathcal{H} = \frac{1}{2} \hbar^2 c^2 \left[ (\partial_0 \phi)^2 + (\partial_i \phi)^2 \right] + \frac{1}{2} m^2 c^4 \phi^2 + V(\phi). (18.9)


18.2 Field equations and continuity 


The variation of the action (with V = 0) leads to


\delta S = \int (dx) \left\{ -\hbar^2 c^2\, \delta\phi_A\, \Box \phi_A + m^2 c^4\, \phi_A\, \delta\phi_A - J_A\, \delta\phi_A \right\}
    + \frac{1}{c} \int d\sigma^\mu \left[ \hbar^2 c^2\, \delta\phi_A\, (\partial_\mu \phi_A) \right]. (18.10)

Appealing to the action principle (see chapter 4), we surmise that the field
equations are


\left( -\Box + \frac{m^2 c^2}{\hbar^2} \right) \phi_A = (\hbar^2 c^2)^{-1} J_A, (18.11)


and that the condition for continuity of the field through any n-dimensional
surface is


\Delta \Pi^A_\sigma = 0. (18.12)


If a delta-function source ΔJ_A = Δj_A δ(x) is added to J_A exactly on the surface
σ, then this continuity equation is modified, and the new condition is given by


\Delta \Pi^A_\sigma = \Delta j^A\, n_\sigma, (18.13)


where n^σ is the unit normal vector to σ. This equation tells us that a sudden
change in the momentum of the field can only be caused by an impulsive force
(source) Δj.


18.3 Free-field solutions 
The field φ(x) may be expanded as a linear combination of a complete set of
plane wavefunctions satisfying the equation of motion,


\phi(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\; \phi(k)\, e^{ikx}\, \delta\left( \hbar^2 c^2 k^2 + m^2 c^4 \right), (18.14)


where φ(k) are arbitrary coefficients, independent of x. The reality of the field
requires that

\phi^*(k) = \phi(-k). (18.15)


The integral ranges over all energies, but one can separate the positive and
negative energy solutions by writing


\phi(x) = \phi^{(+)}(x) + \phi^{(-)}(x), (18.16)


where

\phi^{(+)}(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\; \phi(k)\, e^{ikx}\, \theta(-k_0)\, \delta\left( \hbar^2 c^2 k^2 + m^2 c^4 \right)

\phi^{(-)}(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\; \phi(k)\, e^{ikx}\, \theta(k_0)\, \delta\left( \hbar^2 c^2 k^2 + m^2 c^4 \right). (18.17)

The symmetry of the energy relation then implies that


\phi^{(+)}(x) = \left( \phi^{(-)}(x) \right)^*. (18.18)

The positive and negative energy solutions to the free relativistic field equations
form independently complete sets, with respect to the scalar product,


(\phi^{(+)}(x), \phi^{(+)}(x)) = const.
(\phi^{(-)}(x), \phi^{(-)}(x)) = const.
(\phi^{(+)}(x), \phi^{(-)}(x)) = 0. (18.19)


18.4 Reality of solutions 


It should be noted that the uses of the real scalar field are somewhat limited. The 
boundary conditions one can apply to a real scalar field are only the retarded or 
advanced ones. The solution 


\phi(x) = \int (dx')\, G(x, x')\, J(x') (18.20)


is only real if the Green function itself is real. This excludes the use of the 
time-ordered (Feynman) Green function. 


18.5 Conserved norm and probability 


Since the real scalar field has no complex phase symmetry, Noether’s theorem 
leads to no conserved quantities corresponding to a conserved inner product. 
It is possible to define an invariant inner product on the manifold of positive 
energy solutions, however. This is what introduces the complex symmetry in 
the non-relativistic limit; 


Poop (18.21) 


has no definite sign. 
Since the relativistic energy equation E² = p²c² + m²c⁴ admits both
possibilities, we do this by writing the real field as a sum of two parts,


\phi = \phi^{(+)} + \phi^{(-)}, (18.22)


where φ^{(+)*} = φ^{(−)}. φ^{(+)} is a complex quantity, but the sum φ^{(+)} + φ^{(−)} is
clearly real. What this means is that it is possible to define a conserved current
and therefore an inner product on the manifold of positive energy solutions φ^{(+)},


(\phi_1^{(+)}, \phi_2^{(+)}) = i\hbar c \int d\sigma^\mu \left( \phi_1^{(+)*}\, \partial_\mu \phi_2^{(+)} - (\partial_\mu \phi_1^{(+)})^*\, \phi_2^{(+)} \right), (18.23)

and another on the manifold of negative energy solutions φ^{(−)}. Thus there is
local conservation of probability (though charge still does not make any sense)
of particles and anti-particles separately.
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18.6 Normalization 


The scalar product is only defined for normalizable wave-packet solutions, i.e. 
those for which (φ, φ) < ∞. A plane wave is a limiting case, which can only be
defined by box normalization. It does not belong to the Hilbert space. However, 
adopting an invariant normalization in momentum space, one can express plane 
waves simply. Noting that the following construction is both invariant and ‘on 
shell’, i.e. satisfies the Klein—Gordon equation, 


dt lk ik. 22:2 2_4 

s= Eo *9(4k9)5(p7c* + m*c*) 
[ees (18.24) 
J Qr) 2po` l 


Adopting the normalization 


($(P), 9) = 2po 8P — p) T)”, (18.25) 


a positive energy solution takes the form 


o*(p) = e" (vo = yp? + m?) j (18.26) 


18.7 Formal solution by Green functions 


The formal solution of the equations of motion can be written down in terms 
of Green functions. The essence of the procedure is to find the inverse of the 
differential operator on the left hand side of eqn. (18.11). Formally, we may 
write 


\phi_A = \left( -\Box + \frac{m^2 c^2}{\hbar^2} \right)^{-1} (\hbar^2 c^2)^{-1} J_A, (18.27)


where this is given meaning by comparing it with the expression involving the 
Green function or ‘kernel’ G 4g (x, x’): 


\phi_A(x) = (\hbar^2 c^2)^{-1} \int (dx')\, G_{AB}(x, x')\, J_B(x'). (18.28)


Comparing eqns. (18.27) and (18.28), we see that G_AB(x, x′) must satisfy the
equation


\left( -\Box + \frac{m^2 c^2}{\hbar^2} \right) G_{AB}(x, x') = \delta_{AB}\, \delta(x, x'), (18.29)


and thus we see that G_AB(x, x′) is the inverse of the differential operator, insofar
as δ_AB δ(x, x′) can be regarded as the ‘identity’ operator.

In this case, the indices A, B on the Green function are superfluous, since

G_{AB}(x, x') = \delta_{AB}\, G(x, x'), (18.30)


but non-diagonal terms in A, B might be important when the components of the 
field interact. This is the case in a gauge theory, for example. 

The Green function G(x, x’) is not unique: there is still a freedom to choose 
the boundary conditions. By this we mean a specification of how the field is 
affected by changes in the source J, both in the past and in the future. The 
‘causal’ Green function, also referred to as the retarded Green function, is such 
that φ(x) is only affected by a change in J(x′) if x > x′.


18.8 All the Green functions 
The symmetry of the Green functions is as follows: 
Gap(x, x’) = Gear’, x) 
Gap (x, x") = —Gpa(x', x) 
Grag (x, x’) = Grga x’, x). (18.31) 


The symmetrical parts of the Wightman functions may be constructed explicitly. 
For example 


1 
2 


1 / =, f 
= OUTE] 

1 f / * 
a Eo pS (GG )) | 


= iImG (x, x’). (18.32) 


[ee xN +G, x)| 


The retarded, advanced and Feynman Green functions are all constructed from 
causally selective combinations of the Wightman functions. 
GR, xN = -G ', x) 


* 
(e (x, x) = Gi, Giz): (18.33) 
The properties of the step function lead to a number of linear relations: 


G_r(x, x') = -\theta(t - t')\, \tilde{G}(x, x')

G_a(x, x') = \theta(t' - t)\, \tilde{G}(x, x')

G_r(x, x') = G_F(x, x') - G^{(-)}(x, x')

G_a(x, x') = G_F(x, x') + G^{(+)}(x, x')

G_F(x, x') = -\theta(t - t')\, G^{(+)}(x, x') + \theta(t' - t)\, G^{(-)}(x, x'). (18.34)
Some caution is needed in interpreting the latter two relations, which should
be considered formal. The causal properties of the Green functions distinguish
G^{(±)}(x, x′), which satisfy the homogeneous eqn. (5.65), from G_r, G_F, which
pose as right-inverses for a differential operator and satisfy an equation such as
eqn. (5.62). We can investigate this by calculating time derivatives. Starting with
the definition in eqn. (18.34), we obtain the time derivatives using the relations
in sections A.1 and A.2 of Appendix A:


\partial_t G_F(x, x') = -\delta(t, t')\, \tilde{G}(x, x') - \theta(t - t')\, \partial_t G^{(+)}(x, x')
    + \theta(t' - t)\, \partial_t G^{(-)}(x, x'), (18.35)


where eqn. (5.71) was used. The second derivative is thus


\partial_t^2 G_F(x, x') = -\partial_t \delta(t - t')\, \tilde{G}(x, x') - \delta(t - t')\, \partial_t \tilde{G}(x, x')
    - \delta(t - t')\, \partial_t \tilde{G}(x, x') - \theta(t - t')\, \partial_t^2 G^{(+)}(x, x')
    + \theta(t' - t)\, \partial_t^2 G^{(-)}(x, x'). (18.36)


The property in eqn. (A.14) was used here. Thus using eqn. (5.73) we may write


\partial_t^2 G_F(x, x') = \delta(t - t')\, \delta(\mathbf{x} - \mathbf{x}') - \theta(t - t')\, \partial_t^2 G^{(+)}(x, x')
    + \theta(t' - t)\, \partial_t^2 G^{(-)}(x, x'). (18.37)


From this it should be clear that


(-\Box + M^2)\, G_F(x, x') = \delta(t - t')\, \delta(\mathbf{x} - \mathbf{x}')
    - \theta(t - t')\, (-\Box + M^2)\, G^{(+)}(x, x')
    + \theta(t' - t)\, (-\Box + M^2)\, G^{(-)}(x, x')
    = c\, \delta(x, x'). (18.38)


The Green function for the scalar field is directly related to that for the
electromagnetic field in the Lorentz–Feynman gauge, up to factors of ħ and μ₀.


Dwa x = Loh? G(x, x’) Si (18.39) 


18.9 The energy-momentum tensor 


The application of Noether’s theorem for spacetime translations leads to a 
symmetrical energy-momentum tensor. Although the sign of the energy is 
ambiguous for the Klein—Gordon field, we can define a Hamiltonian with the 
interpretation of an energy density which is positive definite, from the zero—zero 
component of the energy-momentum tensor. Using the action and the formula 
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(11.44), we have 


600 = ~~ (oba) — L800 


aL 
co 
=H= she’ [ (dopa)? + (iba) +5 zmc ph + Vp). (18.40) 


This quantity has the interpretation of a Hamiltonian density. The integral over 
all space provides a definition of the Hamiltonian: 


$$
H = \int d\sigma\, \mathcal{H}. \tag{18.41}
$$


The explicit use of zero instead of a general timelike direction here makes this 
definition of the Hamiltonian explicitly non-covariant. Note that this is not a 
differential Hamiltonian operator analogous to that in eqn. (17.3), but more like 
an expectation value. In the quantum theory (in which the fields are operator- 
valued) this becomes the Hamiltonian operator. 

The off-diagonal spacetime components give 


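$$
\theta_{0i} = \theta_{i0} = \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A)}\,(\partial_i\phi_A)
  = \hbar^2c^2\,(\partial_0\phi_A)(\partial_i\phi_A). \tag{18.42}
$$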


Since there is no invariant inner product for the real scalar field, it is awkward to 
define this as a field momentum. However, on the manifold of positive energy 
solutions $\phi^{(+)}$, the integral over all space may be written


$$
\int d\sigma\,\theta_{0i} = -(\phi^{(+)},\, p_i c\,\phi^{(+)}), \tag{18.43}
$$
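where $p_i = -i\hbar\partial_i$. The diagonal spatial components are

$$
\begin{aligned}
\theta_{ii} &= \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A)}\,(\partial_i\phi_A) - \mathcal{L} \\
&= \hbar^2c^2(\partial_i\phi_A)^2 - \frac{1}{2}\hbar^2c^2(\partial^\mu\phi_A)(\partial_\mu\phi_A)
   - \frac{1}{2}m^2c^4\phi_A^2 - V(\phi),
\end{aligned} \tag{18.44}
$$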

where the repeated i index is not summed. The off-diagonal ‘stress’ tensor is 


$$
\theta_{ij} = \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A)}\,(\partial_j\phi_A)
  = \hbar^2c^2\,(\partial_i\phi_A)(\partial_j\phi_A), \tag{18.45}
$$

where $i \neq j$. Notice that the trace of the space parts in $n + 1$ dimensions gives
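$$
\sum_i \theta_{ii} = \mathcal{H} - m^2c^4\phi_A^2 - 2V(\phi) - (n-1)\mathcal{L}, \tag{18.46}
$$

so that the full trace is

$$
\theta^\mu{}_\mu = g^{\mu\nu}\theta_{\nu\mu} = -m^2c^4\phi_A^2 - 2V(\phi) - (n-1)\mathcal{L}, \tag{18.47}
$$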


which vanishes in 1 + 1 dimensions in the massless, potential-less theory. 
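
The vanishing of the trace can be checked numerically at a single spacetime
point. The following is a minimal sketch, not part of the original text: it
assumes the metric signature $(-,+,\ldots,+)$ and the form
$\theta_{\mu\nu} = \hbar^2c^2(\partial_\mu\phi)(\partial_\nu\phi) - \mathcal{L}g_{\mu\nu}$
that follows from the formula used above. The function name and the sample
field values are hypothetical.

```python
def stress_tensor_trace(d_phi, phi, n, m=0.0, hbar=1.0, c=1.0, potential=0.0):
    """Trace g^{mu nu} theta_{nu mu} at one spacetime point, in n+1 dimensions.

    d_phi     -- [d_0 phi, d_1 phi, ..., d_n phi], lower-index derivatives
    phi       -- value of the real scalar field at the point
    potential -- value of V(phi) at the point (hypothetical input)
    """
    if len(d_phi) != n + 1:
        raise ValueError("need n+1 derivative components")
    g = [-1.0] + [1.0] * n                                  # diagonal metric (-,+,...,+)
    kinetic = sum(d * d / gmm for d, gmm in zip(d_phi, g))  # (d^mu phi)(d_mu phi)
    lagrangian = (0.5 * hbar**2 * c**2 * kinetic
                  + 0.5 * m**2 * c**4 * phi**2 + potential)
    # theta_{mu mu} = hbar^2 c^2 (d_mu phi)^2 - L g_{mu mu}   (no sum on mu)
    theta_diag = [hbar**2 * c**2 * d * d - lagrangian * gmm
                  for d, gmm in zip(d_phi, g)]
    # contract with the inverse metric, g^{mu mu} = 1 / g_{mu mu}
    return sum(t / gmm for t, gmm in zip(theta_diag, g))


trace_2d_massless = stress_tensor_trace([0.3, -1.2], phi=0.7, n=1)
trace_4d_massless = stress_tensor_trace([0.3, -1.2, 0.5, 0.1], phi=0.7, n=3)
trace_2d_massive = stress_tensor_trace([0.3, -1.2], phi=0.7, n=1, m=2.0)
```

With these sample values the massless trace is zero (to rounding) in 1 + 1
dimensions but not in 3 + 1 dimensions, and a mass term alone contributes
$-m^2c^4\phi^2$ in 1 + 1 dimensions.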


19 
The complex Klein—Gordon field 


This chapter is a supplement to the material for a real scalar field. Much of the 
previous chapter applies here too. 


19.1 The action 
The free-field action is given by 


$$
S = \int (dx)\left\{ \hbar^2c^2(\partial^\mu\phi_A^*)(\partial_\mu\phi_A) + m^2c^4\phi_A^*\phi_A
  + V(\phi) - J_A^*\phi_A - J_A\phi_A^* \right\}. \tag{19.1}
$$


The field now has effectively twice as many components as the real scalar field, 
coming from the real and imaginary parts. 


19.2 Field equations and continuity 


Since the complex field and its complex conjugate are independent variables, 
there are equations of motion for both of these. Varying S first with respect to 
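$\phi_A^*$, we obtain

$$
\begin{aligned}
\delta S ={}& \int (dx)\,\delta\phi_A^*\left(-\hbar^2c^2\,\Box\phi_A + m^2c^4\phi_A + V'(\phi) - J_A\right) \\
& + \hbar^2c^2\int d\sigma^\mu\,\delta\phi_A^*\,(\partial_\mu\phi_A) = 0,
\end{aligned} \tag{19.2}
$$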


which, in the absence of the potential $V$, gives rise to the field equation

$$
\left(-\Box + \frac{m^2c^2}{\hbar^2}\right)\phi_A = (\hbar c)^{-2} J_A, \tag{19.3}
$$

and the continuity condition over a spacelike hyper-surface,

$$
\Delta(\partial_0\phi_A(x)) = \Delta\Pi_A = 0, \tag{19.4}
$$

identifies the conjugate momentum $\Pi_A$. Conversely, the variation with respect
to $\phi_A$ gives the conjugate field equation

$$
\left(-\Box + \frac{m^2c^2}{\hbar^2}\right)\phi_A^* = (\hbar c)^{-2} J_A^*, \tag{19.5}
$$

and the corresponding continuity condition over a spacelike hyper-surface,

$$
\Delta(\partial_0\phi_A^*(x)) = \Delta\Pi_A^* = 0. \tag{19.6}
$$


19.3 Free-field solutions 
The free-field solutions for the complex scalar field have the same form as those 
of the real scalar field, 
$$
\phi(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\,\phi(k)\,e^{ikx}\,\delta(\hbar^2c^2k^2 + m^2c^4); \tag{19.7}
$$


however, the Fourier coefficients are no longer restricted by eqn. (18.15). 


19.4 Formal solution by Green functions 


The formal solution to the field equation may be expressed in terms of a Green 
function $G_{AB}(x,x')$ by

$$
\phi_A(x) = \int (dx')\, G_{AB}(x,x')\, J_B(x'), \tag{19.8}
$$

where the Green function satisfies the equation

$$
\left(-\Box + \frac{m^2c^2}{\hbar^2}\right) G_{AB}(x,x') = \delta(x,x')\,\delta_{AB}. \tag{19.9}
$$
Similarly, 
$$
\phi_A^\dagger(x) = \int (dx')\, J_B^\dagger(x')\, G_{BA}(x',x). \tag{19.10}
$$


Note that, although the fields are designated as conjugates $\phi$ and $\phi^\dagger$, this
relationship is not necessarily preserved by the choice of boundary conditions. 
If the time-ordered, or Feynman Green function is used (which represents virtual 
processes), then the resulting fields do not remain conjugate to one another over 
time. The retarded Green function does preserve the conjugate relationship 
between the fields, since it is real. 
19.5 Conserved norm and probability


Let s be independent of x, and consider the phase transformation of the kinetic 
part of the action: 


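$$
\begin{aligned}
S &= \int (dx)\,\hbar^2c^2\left\{ (\partial^\mu e^{-is}\phi^*)(\partial_\mu e^{is}\phi) \right\} \\
\delta S &= \int (dx)\,\hbar^2c^2\left[ (\partial^\mu\phi^*(-i\delta s)e^{-is})(\partial_\mu\phi\, e^{is}) + {\rm c.c.} \right] \\
&= \int d\sigma^\mu\, J_\mu\,\delta s,
\end{aligned} \tag{19.11}
$$

where

$$
J^\mu = -i\hbar^2c^2\,(\phi^*\partial^\mu\phi - \phi\,\partial^\mu\phi^*). \tag{19.12}
$$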


The conserved ‘charge’ of this symmetry can now be used as the definition of 
the inner product between fields: 


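$$
(\phi_1, \phi_2) = i\hbar c\int d\sigma^\mu\left(\phi_1^*\,\partial_\mu\phi_2 - (\partial_\mu\phi_1)^*\phi_2\right), \tag{19.13}
$$

or, in non-covariant form,

$$
(\phi_1, \phi_2) = i\hbar c\int d\sigma\left(\phi_1^*\,\partial_0\phi_2 - (\partial_0\phi_1)^*\phi_2\right). \tag{19.14}
$$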


This is now our notion of probability. 


19.6 The energy-momentum tensor 


The application of Noether’s theorem for spacetime translations leads to a 
symmetrical energy-momentum tensor. Although the sign of the energy is 
ambiguous for the Klein—Gordon field, we can define a Hamiltonian with the 
interpretation of an energy density which is positive definite, from the zero—zero 
component of the energy-momentum tensor. Using the action and the formula 
(11.44), we have 


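$$
\begin{aligned}
\theta_{00} &= \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A)}\,(\partial_0\phi_A)
  + \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A^*)}\,(\partial_0\phi_A^*) - \mathcal{L}\,g_{00} \\
&= \hbar^2c^2\left[(\partial_0\phi_A^*)(\partial_0\phi_A) + (\partial_i\phi_A^*)(\partial_i\phi_A)\right]
  + m^2c^4\phi_A^*\phi_A + V(\phi).
\end{aligned} \tag{19.15}
$$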


Thus, the last line defines the Hamiltonian density H, and the Hamiltonian is 
given by 


$$
H = \int d\sigma\, \mathcal{H}. \tag{19.16}
$$




The off-diagonal spacetime components define a momentum: 


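$$
\begin{aligned}
\theta_{0i} = \theta_{i0} &= \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A)}\,(\partial_i\phi_A)
  + \frac{\partial\mathcal{L}}{\partial(\partial^0\phi_A^*)}\,(\partial_i\phi_A^*) \\
&= \hbar^2c^2\left\{ (\partial_0\phi_A^*)(\partial_i\phi_A) + (\partial_0\phi_A)(\partial_i\phi_A^*) \right\}.
\end{aligned} \tag{19.17}
$$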


Taking the integral over all space enables us to integrate by parts and show that 
this quantity is the expectation value (inner product) of the momentum: 


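$$
\begin{aligned}
\int d\sigma\,\theta_{0i} &= -\hbar^2c^2\int d\sigma\left(\phi^*\,\partial_i\partial_0\phi - (\partial_0\phi^*)\,\partial_i\phi\right) \\
&= -(\phi,\, p_i c\,\phi),
\end{aligned} \tag{19.18}
$$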
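where $p_i = -i\hbar\partial_i$. The diagonal space components are given by

$$
\begin{aligned}
\theta_{ii} &= \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A)}\,(\partial_i\phi_A)
  + \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A^*)}\,(\partial_i\phi_A^*) - \mathcal{L} \\
&= 2\hbar^2c^2\,(\partial_i\phi^*)(\partial_i\phi) - \mathcal{L},
\end{aligned} \tag{19.19}
$$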


where i is not summed. Similarly, the off-diagonal ‘stress’ components are given 
by 


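$$
\begin{aligned}
\theta_{ij} &= \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A)}\,(\partial_j\phi_A)
  + \frac{\partial\mathcal{L}}{\partial(\partial^i\phi_A^*)}\,(\partial_j\phi_A^*) \\
&= \hbar^2c^2\left( (\partial_i\phi_A^*)(\partial_j\phi_A) + (\partial_j\phi_A^*)(\partial_i\phi_A) \right) \\
&= \hbar^{-1}c\,(\phi_A,\, p_i p_j\,\phi_A).
\end{aligned} \tag{19.20}
$$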
We see that the trace over spatial components in n + 1 dimensions is 


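$$
\sum_i \theta_{ii} = \mathcal{H} - 2m^2c^4\phi_A^*\phi_A - 2V(\phi) - (n-1)\mathcal{L}, \tag{19.21}
$$

so that the full trace gives

$$
\theta^\mu{}_\mu = g^{\mu\nu}\theta_{\nu\mu} = -2m^2c^4\phi_A^*\phi_A - 2V(\phi) - (n-1)\mathcal{L}. \tag{19.22}
$$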


This vanishes for $m = V = 0$ in 1 + 1 dimensions.


19.7 Formulation as a two-component real field 


The real and imaginary components of the complex scalar field can be
parametrized as a two-component vector of real fields $\phi_A$, where $A = 1, 2$.
Define

$$
\phi(x) = \frac{1}{\sqrt{2}}\left(\phi_1(x) + i\phi_2(x)\right). \tag{19.23}
$$
Substituting in, and comparing real and imaginary parts, one finds

$$
D_\mu = \partial_\mu + ieA_\mu, \tag{19.24}
$$

or

$$
D_\mu\phi_A = \partial_\mu\phi_A - e\,\epsilon_{AB}A_\mu\phi_B. \tag{19.25}
$$
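
The equivalence of the complex and two-component forms of the covariant
derivative can be checked with ordinary complex arithmetic. This is a small
sketch under stated assumptions: the normalization $\phi = (\phi_1 + i\phi_2)/\sqrt{2}$,
the convention $\epsilon_{12} = +1$, and all function names and sample numbers
are ours, chosen for illustration.

```python
import math


def to_complex(pair):
    """Combine real components (phi_1, phi_2) into phi = (phi_1 + i phi_2)/sqrt(2)."""
    return (pair[0] + 1j * pair[1]) / math.sqrt(2)


def covariant_derivative_complex(d_phi, phi, gauge_field, e):
    """D phi = d phi + i e A phi, for complex values at a single point."""
    return d_phi + 1j * e * gauge_field * phi


def covariant_derivative_real(d_phi, phi, gauge_field, e):
    """D phi_A = d phi_A - e eps_AB A phi_B, with eps_12 = +1 (assumed convention)."""
    eps = [[0, 1], [-1, 0]]
    return [d_phi[a] - e * gauge_field * sum(eps[a][b] * phi[b] for b in range(2))
            for a in range(2)]


# Hypothetical sample values for the components and their derivatives.
phi_real = [0.4, -1.1]
d_phi_real = [0.2, 0.9]
A_mu, charge = 0.7, 1.3

lhs = covariant_derivative_complex(to_complex(d_phi_real), to_complex(phi_real), A_mu, charge)
rhs = to_complex(covariant_derivative_real(d_phi_real, phi_real, A_mu, charge))
```

Here `lhs` and `rhs` agree to rounding, which is the content of comparing real
and imaginary parts.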
The action becomes 


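$$
S = \int (dx)\left\{ \frac{1}{2}\hbar^2c^2\,(D^\mu\phi_A)(D_\mu\phi_A) + \frac{1}{2}m^2c^4\,\phi_A\phi_A - J_A\phi_A \right\}. \tag{19.26}
$$

Notice that the operation of charge conjugation is seen trivially here, due to the
presence of $\epsilon_{AB}$, as the swapping of field labels.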


20 
The Dirac field 


The Klein—Gordon equation’s negative energy solutions and corresponding 
negative probabilities prompted Dirac to look for a relativistically invariant 
equation of motion which was linear in time. His equation was formally the 
square-root of the Klein—Gordon equation. 

The Dirac equation leads naturally to the existence of spin $\frac{1}{2}$. It is the basic
starting point for the study of spin-$\frac{1}{2}$ particles such as the electron and quarks. It
also appears in condensed matter physics as the relevant low-energy degrees of 
freedom in the strong-coupling limit of the Hubbard model [92], and has been 
used as an alternative formulation of gravity [84]. 


20.1 The action 
The action for the Dirac field is given by 


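$$
S = \int (dx)\left\{ -\frac{1}{2}i\hbar c\,\bar\psi\left(\gamma^\mu\overrightarrow{\partial_\mu}
  - \gamma^\mu\overleftarrow{\partial_\mu}\right)\psi
  + \bar\psi\,(mc^2 + V)\,\psi - \bar J\psi - \bar\psi J \right\}, \tag{20.1}
$$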


where $\psi$ and $\bar\psi = \psi^\dagger\gamma^0$ are $d_R$-component spinors. The $\gamma^\mu$ are
$d_R \times d_R$ matrices, defined below. All quantities here are implicitly
matrix-valued. They have hidden 'spinor' indices, which we shall write explicitly
at times using Greek letters $\alpha, \beta, \ldots$.

The equation of motion for $\psi$ is found by varying the action with respect
to $\bar\psi$, and is given by


$$
(-i\hbar c\,\gamma^\mu\partial_\mu + mc^2 + V)\,\psi = J. \tag{20.2}
$$

If we drop the source term $J$, this can also be written

$$
i\hbar c\,\partial_0\psi = \gamma^0\,(-i\hbar c\,\gamma^i\partial_i + mc^2 + V)\,\psi = H_D\psi, \tag{20.3}
$$

where $H_D$ is the differential Hamiltonian operator (to be distinguished from
the field theoretical Hamiltonian below). The conjugate equation is found by
varying the action with respect to $\psi$ and may be written as

$$
\bar\psi\,(i\hbar c\,\gamma^\mu\overleftarrow{\partial_\mu} + mc^2 + V) = 0, \tag{20.4}
$$

or in terms of the differential Hamiltonian operator $H_D$,

$$
-i\hbar c\,(\partial_0\bar\psi) = \bar\psi\,\gamma^0 H_D\gamma^0. \tag{20.5}
$$


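The free Dirac equation may be viewed as essentially the square-root of the
Klein-Gordon equation. In the massless limit, the linear combination of
derivatives is a Lorentz-scalar-representation of the square-root of $-\Box$. This
may be verified by squaring the Dirac operator and separating the product of
$\gamma$-matrices into symmetric and anti-symmetric parts:

$$
\begin{aligned}
(\gamma^\mu\partial_\mu)^2 &= \gamma^\mu\gamma^\nu\,\partial_\mu\partial_\nu && (20.6)\\
&= \frac{1}{2}\{\gamma^\mu,\gamma^\nu\}\,\partial_\mu\partial_\nu + \frac{1}{2}[\gamma^\mu,\gamma^\nu]\,\partial_\mu\partial_\nu && (20.7)\\
&= -g^{\mu\nu}\,\partial_\mu\partial_\nu + \frac{1}{2}\gamma^\mu\gamma^\nu\,[\partial_\mu,\partial_\nu] && (20.8)\\
&= -\Box. && (20.9)
\end{aligned}
$$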


The commutator of two partial derivatives vanishes when the derivatives act 
on any non-singular function. Since the fields are non-singular, except in the 
presence of certain exceptional interactions which do not apply here, the Dirac 
operator can be identified as the square-root of the d'Alembertian.


20.2 The y-matrices 


In order to satisfy eqn. (20.9), the $\gamma$-matrices must satisfy the relation

$$
\{\gamma^\mu, \gamma^\nu\} = -2g^{\mu\nu} \tag{20.10}
$$

$$
(\gamma^0)^2 = -(\gamma^i)^2 = I. \tag{20.11}
$$


The matrices satisfy a Clifford algebra. The set of matrices which satisfies 
this constraint is of fundamental importance to the Dirac theory. They are not 
unique, but may have several representations. The form of the y” matrices 
is dependent on the dimension of spacetime and, since they carry a spacetime 
index, on the Lorentz frame [61, 103]. 

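Products of the $\gamma^\mu$ form a group of matrices $\Gamma_a$, where $a = 1, \ldots, d_G$, and
the dimension of the group is $d_G = 2^{n+1}$. The elements $\Gamma_a$ are proportional to
the unique combinations:

$$
\begin{array}{l}
I \\
\gamma^0,\ i\gamma^1,\ i\gamma^2,\ i\gamma^3 \\
\gamma^0\gamma^1,\ \gamma^0\gamma^2,\ \gamma^0\gamma^3,\ i\gamma^1\gamma^2,\ i\gamma^2\gamma^3,\ i\gamma^3\gamma^1 \\
i\gamma^0\gamma^1\gamma^2,\ i\gamma^0\gamma^1\gamma^3,\ i\gamma^0\gamma^2\gamma^3,\ \gamma^1\gamma^2\gamma^3 \\
i\gamma^0\gamma^1\gamma^2\gamma^3.
\end{array}
$$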


Factors of i have been introduced so that each matrix squares to the identity 
(see ref. [112]). These may also be grouped differently, in the more suggestive 
Lorentz-covariant form: 


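$$
\begin{array}{ll}
1 & \text{scalar} \\
\gamma^\mu & \text{vector} \\
\sigma^{\mu\nu} & \text{anti-symmetric tensor} \\
\gamma_5\gamma^\mu & \text{pseudo-(axial) vector} \\
\gamma_5 & \text{pseudo-scalar}
\end{array}
$$

where

$$
\sigma^{\mu\nu} = \frac{1}{2i}\,[\gamma^\mu, \gamma^\nu]. \tag{20.12}
$$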


For each $\Gamma_a$, with the exception of the identity element, it is possible to find a
suitably defined $\Gamma_b$, such that

$$
\Gamma_b\Gamma_a\Gamma_b = -\Gamma_a \qquad (\Gamma_a \neq I). \tag{20.13}
$$

By taking the trace of this relation, and noting that

$$
{\rm Tr}(\Gamma_b\Gamma_a\Gamma_b) = {\rm Tr}(\Gamma_a\Gamma_b\Gamma_b) = {\rm Tr}(\Gamma_a), \tag{20.14}
$$

one obtains

$$
{\rm Tr}(\Gamma_a) = -{\rm Tr}(\Gamma_a) = 0. \tag{20.15}
$$

From this, it follows that the $2^{n+1}$ elements are linearly independent, since, if
one attempts to construct a linear combination which is zero,

$$
\sum_a \lambda_a\Gamma_a = 0, \tag{20.16}
$$

then multiplying by each $\Gamma_b$ in turn and taking the trace implies that each of
the components $\lambda_a = 0$.
This establishes that each component is linearly independent and that a matrix
representation for the $\gamma^\mu$ must have at least this number of elements, in order
to satisfy the algebra constraints. This divides the possibilities into two cases,
depending on whether the dimension of spacetime is even or odd.
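
The trace argument can be made concrete in 3 + 1 dimensions. The sketch below
builds all $2^4$ ordered products of distinct $\gamma$-matrices, rescales each by
a factor of $i$ where needed so that it squares to the identity, and forms the
matrix of traces ${\rm Tr}(\Gamma_a\Gamma_b)$. It assumes the block representation
$\gamma^0 = {\rm diag}(-1, 1)$, $\gamma^i = \begin{pmatrix}0 & -\sigma^i\\ \sigma^i & 0\end{pmatrix}$,
which is our reading of the standard representation; helper names are
hypothetical and only the Python standard library is used.

```python
from itertools import combinations

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


# Assumed representation (see lead-in): gamma^0, gamma^1, gamma^2, gamma^3.
GAMMA = [block(scale(-1, I2), Z2, Z2, I2)] + [block(Z2, scale(-1, s), s, Z2) for s in PAULI]
IDENTITY = block(I2, Z2, Z2, I2)


def clifford_basis():
    """All 2^4 products of distinct gamma matrices, rescaled to square to +I."""
    basis = []
    for r in range(5):
        for indices in combinations(range(4), r):
            prod = IDENTITY
            for mu in indices:
                prod = matmul(prod, GAMMA[mu])
            square = matmul(prod, prod)              # equals +I or -I
            factor = 1 if abs(square[0][0] - 1) < 1e-12 else 1j
            basis.append(scale(factor, prod))
    return basis


BASIS = clifford_basis()
# Gram matrix of traces: diagonal (4 on the diagonal) means linear independence.
GRAM = [[trace(matmul(a, b)) for b in BASIS] for a in BASIS]
```

Because `GRAM` is four times the unit matrix, taking the trace of
$\Gamma_b\sum_a\lambda_a\Gamma_a = 0$ isolates $\lambda_b$, which is the step used
in the argument above.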

For even $n + 1$, the $\gamma^\mu$ are most simply represented as $d_R \times d_R$ matrices, where

$$
d_R = 2^{(n+1)/2}; \tag{20.17}
$$

$d_R^2$ then contains exactly the right number of elements. Although the matrices
are not unique (they can be transformed by similarity transformations), all such
sets of matrices of this dimension are equivalent representations. Moreover,
since there is no redundancy in the matrices, the $d_R \times d_R$ representations are
also irreducible, or fundamental. In this case, the identity is the only element
of the group which commutes with every other element (the group is said to
have a trivial centre). Another common way of expressing this, in the literature,
is to observe that other matrices, typically $\gamma^0\gamma^1\ldots\gamma^n$, anti-commute with
an arbitrary element $\gamma^\mu$. There are thus more elements in the centre of the
group than the identity. This is a sign of reducibility or multiple equivalent
representations.

For odd $n + 1$, it is not possible to construct a matrix with exactly the
right number of elements. This is a symptom of the existence of several
inequivalent representations of the algebra. In this case, one must either
construct several sets of smaller matrices (which are inequivalent), or combine
these into matrices of larger dimension, which are reducible. The reducible
matrices reduce to block-diagonal representations, in which the blocks are the
multiple, inequivalent, irreducible representations. In this case, the identity is
not the only element of the group which commutes with every other element
(the group is said to have a non-trivial centre), and the matrix $\gamma^0\gamma^1\ldots\gamma^n$
commutes with an arbitrary element $\gamma^\mu$. Spinors in $n + 1$ dimensions are
discussed in ref. [10].


20.2.1 Example: n + 1 = 4


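In 3 + 1 dimensions, the dimension of the algebra is $2^4 = 16$, and one has the
standard representation,

$$
\gamma^0 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\gamma^i = \begin{pmatrix} 0 & -\sigma^i \\ \sigma^i & 0 \end{pmatrix}, \tag{20.18}
$$

where $\sigma^i$ are the Pauli matrices, defined by

$$
\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{20.19}
$$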
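The product of two $\gamma$-matrices may be evaluated by separating into even and
odd parts, in terms of the commutator and anti-commutator:

$$
\begin{aligned}
\gamma_\mu\gamma_\nu &= \frac{1}{2}[\gamma_\mu, \gamma_\nu] + \frac{1}{2}\{\gamma_\mu, \gamma_\nu\} \\
&= i\sigma_{\mu\nu} - g_{\mu\nu},
\end{aligned} \tag{20.20}
$$

$$
\sigma_{\mu\nu} = -\frac{i}{2}\,[\gamma_\mu, \gamma_\nu]. \tag{20.21}
$$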

The product of all the $\gamma$'s is usually referred to as $\gamma_5$, and is defined by

$$
\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}. \tag{20.22}
$$

Clearly, this notation is poorly motivated in spacetime dimensions other than
3 + 1. In 3 + 1 dimensions, it is straightforward to show that

$$
\gamma_5\gamma^\mu + \gamma^\mu\gamma_5 = 0. \tag{20.23}
$$


The cyclic nature of the trace can be used together with the anti-commutativity
of $\gamma_5$ to prove that the trace of an odd number of $\gamma$-matrices vanishes in 3 + 1
dimensions. To see this, one notes that

$$
{\rm Tr}(\gamma_5 A\gamma_5^{-1}) = {\rm Tr}(A). \tag{20.24}
$$

Thus, choosing a product of $m$ such matrices $A = \gamma_{\mu_1}\ldots\gamma_{\mu_m}$, such that

$$
\gamma_5 A = (-1)^m A\gamma_5, \tag{20.25}
$$

it follows immediately that

$$
{\rm Tr}(A) = (-1)^m\,{\rm Tr}(A), \tag{20.26}
$$

and hence the trace of an odd number $m$ of the matrices must vanish. The
hermiticity properties of the matrices are contained in the relation

$$
\gamma^{\mu\dagger} = \gamma^0\gamma^\mu\gamma^0, \tag{20.27}
$$

which summarizes

$$
\gamma^{0\dagger} = \gamma^0 \tag{20.28}
$$

$$
\gamma^{i\dagger} = -\gamma^i. \tag{20.29}
$$
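
The algebraic statements of this example lend themselves to a direct numerical
check. The following sketch assumes the block representation
$\gamma^0 = {\rm diag}(-1, 1)$, $\gamma^i = \begin{pmatrix}0 & -\sigma^i\\ \sigma^i & 0\end{pmatrix}$
and the metric signature $(-,+,+,+)$; both are our reading of the conventions
used here, and the helper names are hypothetical.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


GAMMA = [block(scale(-1, I2), Z2, Z2, I2)] + [block(Z2, scale(-1, s), s, Z2) for s in PAULI]
IDENTITY = block(I2, Z2, Z2, I2)
METRIC = [-1, 1, 1, 1]                       # diagonal of g^{mu nu}
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))

# {gamma^mu, gamma^nu} = -2 g^{mu nu}
clifford_ok = all(
    close(add(matmul(GAMMA[m], GAMMA[n]), matmul(GAMMA[n], GAMMA[m])),
          scale(-2 * METRIC[m] if m == n else 0, IDENTITY))
    for m, n in product(range(4), repeat=2))
# gamma_5 anti-commutes with every gamma^mu
gamma5_anticommutes = all(close(matmul(GAMMA5, g), scale(-1, matmul(g, GAMMA5))) for g in GAMMA)
# traces of one and of three gamma matrices vanish
odd_traces_vanish = all(abs(trace(g)) < 1e-12 for g in GAMMA) and all(
    abs(trace(matmul(matmul(GAMMA[a], GAMMA[b]), GAMMA[c]))) < 1e-12
    for a, b, c in product(range(4), repeat=3))
# hermiticity: (gamma^mu)^dagger = gamma^0 gamma^mu gamma^0
hermiticity_ok = all(close(dagger(g), matmul(matmul(GAMMA[0], g), GAMMA[0])) for g in GAMMA)
```

All four flags evaluate to `True` for this representation, and `GAMMA5` comes
out as the off-diagonal block matrix with unit blocks.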
20.2.2 Example: n + 1 = 3


In 2 + 1 dimensions, the dimension of the algebra is $2^3 = 8$, and thus the
minimum representation is in terms of either two irreducible sets of 2 × 2
matrices, or a single set of reducible 4 × 4 matrices, with redundant elements.

The fundamental 2 representation is satisfied by

$$
\gamma^0 = \sigma_3, \qquad \gamma^i = i\sigma_i \tag{20.30}
$$

for $i = 1, 2$. This representation breaks parity invariance, thus there are two
inequivalent representations which differ by a sign:

$$
\gamma^\mu_{\pm} = \pm\gamma^\mu. \tag{20.31}
$$

The 4 representation is a symmetrized direct sum of these, padded with zeros:

$$
\gamma^\mu_{(4)} = \begin{pmatrix} \gamma^\mu_{+} & 0 \\ 0 & \gamma^\mu_{-} \end{pmatrix}. \tag{20.32}
$$


The matrices of the 2 representation satisfy

$$
\gamma^\mu\gamma^\nu = -g^{\mu\nu} - i\epsilon^{\mu\nu\lambda}\gamma_\lambda \tag{20.33}
$$

$$
{\rm Tr}(\gamma^\mu\gamma^\nu\gamma^\lambda) = 2i\epsilon^{\mu\nu\lambda} \tag{20.34}
$$

$$
\gamma_5 = \gamma^0\gamma^1\gamma^2 \tag{20.35}
$$

$$
[\gamma_5, \gamma_\mu] = 0, \tag{20.36}
$$

where the first of these relations is found by splitting into a commutator and
anti-commutator and using the su(2) Lie algebra relation for the Pauli matrices:

$$
[\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k. \tag{20.37}
$$

In the 4 representation in 2 + 1 dimensions the results are as for 2 except that

$$
{\rm Tr}(\gamma^\mu\gamma^\nu\gamma^\lambda) = 0. \tag{20.38}
$$

Note that the product of all elements is usually referred to as $\gamma_3$ in the
literature, rather than $\gamma_5$, by analogy with the (3 + 1) dimensional case.


$$
\gamma_3 = i\gamma^0\gamma^1\gamma^2 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}. \tag{20.39}
$$

Since this is a multiple of the identity matrix, it commutes with every element in
the algebra. Thus there are two elements to the centre of the group: $I$ and $-I$.
The centre is the discrete group $Z_2$, and the complete fundamental representation
of the algebra is

$$
\gamma^\mu_{(2)} \otimes Z_2. \tag{20.40}
$$




These two inequivalent representations correspond to the fact that, in a two- 
dimensional plane, spin up and spin down cannot be continuously rotated into 
one another (not even classically), and thus these two physical possibilities are 
disconnected regions of the rotation group. In much of the literature on two- 
dimensional physics, it is common to adopt either a spin up, or spin down 2 x 2 
representation for the y-matrices, not the complete 4 representation. 
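
The two inequivalent 2 × 2 representations can be compared directly. The sketch
below assumes $\gamma^0 = \sigma_3$, $\gamma^i = i\sigma_i$ and its sign-flipped
partner, with metric signature $(-,+,+)$; names such as `G_PLUS` and `G_MINUS`
are ours. It checks the Clifford relation for both sets and evaluates the
product of all three matrices in each.

```python
from itertools import product

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
IDENTITY2 = [[1, 0], [0, 1]]
METRIC3 = [-1, 1, 1]                           # diagonal of g^{mu nu} in 2+1 dimensions


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# gamma^0 = sigma_3, gamma^1 = i sigma_1, gamma^2 = i sigma_2, and the flipped set
G_PLUS = [PAULI[2], scale(1j, PAULI[0]), scale(1j, PAULI[1])]
G_MINUS = [scale(-1, g) for g in G_PLUS]


def satisfies_clifford(gammas):
    """Check {gamma^mu, gamma^nu} = -2 g^{mu nu} for a set of three 2x2 matrices."""
    return all(
        close(add(matmul(gammas[m], gammas[n]), matmul(gammas[n], gammas[m])),
              scale(-2 * METRIC3[m] if m == n else 0, IDENTITY2))
        for m, n in product(range(3), repeat=2))


def triple_product(gammas):
    return matmul(matmul(gammas[0], gammas[1]), gammas[2])


centre_plus = scale(1j, triple_product(G_PLUS))    # i gamma^0 gamma^1 gamma^2
centre_minus = scale(1j, triple_product(G_MINUS))
# In the reducible 4x4 form the trace is the sum over both blocks.
trace_in_4_rep = trace(triple_product(G_PLUS)) + trace(triple_product(G_MINUS))
```

In this convention `centre_plus` is $+I$ and `centre_minus` is $-I$, so the two
representations are distinguished by an element of the centre, while the trace
of three $\gamma$-matrices cancels between the blocks of the 4 representation.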


20.3 Transformation properties of the Dirac equation 


Consider a Lorentz transformation of the Dirac spinor by a matrix representation 
of the Lorentz group: 


$$
\psi'(x') = S(L)\,\psi(x) = L_R\,\psi(x). \tag{20.41}
$$

The matrix, usually denoted $S(L)$ in the literature, is just an example of
a non-adjoint representation of the Lorentz group from section 9.4.3. This
representation has to carry spinor indices $\alpha, \beta$, which are suppressed above,
in the usual way. These spinor indices correspond to the representation indices
$A, B$ of section 9.4.3.

A transformation of the free Dirac equation may be written as 


$$
(\gamma^\mu p'_\mu c + mc^2)\,\psi'(x') = 0 \tag{20.42}
$$

$$
(\gamma^\mu (L^{-1})_\mu{}^{\nu} p_\nu c + mc^2)\,(L_R\psi(x)) = 0, \tag{20.43}
$$

where one recalls that

$$
p'_\mu = (L^{-1})_\mu{}^{\nu}\, p_\nu. \tag{20.44}
$$

Multiplying on the left hand side by $L_R^{-1}$, and comparing with the untransformed
equation, leads to a condition

$$
L_R^{-1}\gamma^\mu L_R = L^\mu{}_\nu\,\gamma^\nu, \tag{20.45}
$$

which is an identity, provided $L = L_{\rm adj}$, the adjoint representation of the group.
The infinitesimal form of the spinor representation may be written in terms of
the generators of this representation $T_R$:

$$
L_R = S(L) = I + \theta^a T_R^a, \tag{20.46}
$$

or, with spinor (representation) indices intact,

$$
S(L)^\alpha{}_\beta = \delta^\alpha{}_\beta + \theta^a\,(T_R^a)^\alpha{}_\beta. \tag{20.47}
$$
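Consider an infinitesimal transformation

$$
x'^\mu = x^\mu + \epsilon\,\omega^\mu{}_\nu\,x^\nu, \tag{20.48}
$$

so that

$$
L_R(I + \epsilon\omega) = I + \epsilon T_R. \tag{20.49}
$$

The adjoint transformation can thus be expressed in two equivalent forms:

$$
\begin{aligned}
S^{-1}\gamma^\mu S &= (1 - \epsilon T)\,\gamma^\mu\,(1 + \epsilon T) \\
&= \gamma^\mu + \epsilon\,(\gamma^\mu T - T\gamma^\mu) \\
&= (L_{\rm adj})^\mu{}_\nu\,\gamma^\nu = (1 + \epsilon\omega)^\mu{}_\nu\,\gamma^\nu \\
&= \gamma^\mu + \epsilon\,\omega^\mu{}_\nu\,\gamma^\nu.
\end{aligned} \tag{20.50}
$$

Thus,

$$
\gamma^\mu T - T\gamma^\mu = \omega^\mu{}_\nu\,\gamma^\nu, \tag{20.51}
$$

which defines $T$ up to a multiple of the identity matrix. Choosing unit
determinant $\det(I + \epsilon T) = 1 + \epsilon\,{\rm Tr}\,T$, we have that ${\rm Tr}\,T = 0$, and one may
write

$$
(T_R)^\alpha{}_\beta = \frac{1}{8}\,\omega^{\mu\nu}\,(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu)^\alpha{}_\beta, \tag{20.52}
$$

or, compactly,

$$
T_R = \frac{i}{4}\,\omega^{\mu\nu}\sigma_{\mu\nu}. \tag{20.53}
$$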


20.3.1 Rotations 


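An infinitesimal rotation by angle $\epsilon$ about the $x^1$ axis has

$$
\omega^{23} = -\omega^{32} = 1, \tag{20.54}
$$

and all other components zero. Thus the generator in the spinor representation
is

$$
T^1 = \frac{1}{2}\gamma^2\gamma^3, \tag{20.55}
$$

and the exponentiated finite element becomes

$$
S(R_1) = e^{\theta_1 T^1} = e^{\frac{1}{2}\theta_1\gamma^2\gamma^3}
  = I\cos\frac{\theta_1}{2} + \gamma^2\gamma^3\sin\frac{\theta_1}{2}, \tag{20.56}
$$

where

$$
\gamma^2\gamma^3 = \begin{pmatrix} -i\sigma_1 & 0 \\ 0 & -i\sigma_1 \end{pmatrix}. \tag{20.57}
$$

The half-angles are characteristic of the double-valued nature of spin:

$$
S(\theta_1 + 2\pi) = -S(\theta_1). \tag{20.58}
$$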
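
The double-valuedness can be seen by exponentiating the generator numerically.
This is an illustrative sketch, assuming the block representation
$\gamma^i = \begin{pmatrix}0 & -\sigma^i\\ \sigma^i & 0\end{pmatrix}$ and a plain
Taylor series for the matrix exponential (the function `expm` below is our own
helper, not a library routine).

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


IDENTITY = block(I2, Z2, Z2, I2)
GAMMA_SPATIAL = [block(Z2, scale(-1, s), s, Z2) for s in PAULI]   # gamma^1, gamma^2, gamma^3
G23 = matmul(GAMMA_SPATIAL[1], GAMMA_SPATIAL[2])                  # gamma^2 gamma^3


def expm(a, terms=80):
    """Matrix exponential by truncated Taylor series (adequate for small norms)."""
    result, term = IDENTITY, IDENTITY
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result


def spinor_rotation(theta):
    """S(R_1) = exp(theta/2 * gamma^2 gamma^3)."""
    return expm(scale(theta / 2.0, G23))


def spinor_rotation_closed_form(theta):
    """I cos(theta/2) + gamma^2 gamma^3 sin(theta/2)."""
    return add(scale(math.cos(theta / 2.0), IDENTITY), scale(math.sin(theta / 2.0), G23))


theta = 0.7   # hypothetical sample angle
```

A rotation by $2\pi$ returns $-S$, and only a rotation by $4\pi$ returns $S$
itself.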
20.3.2 Boosts


For a boost in the $x^1$ direction, $\omega^{10} = -\omega^{01} = 1$,

$$
T(B_1) = \frac{1}{2}\gamma^0\gamma^1 = \frac{1}{2}\begin{pmatrix} 0 & \sigma_1 \\ \sigma_1 & 0 \end{pmatrix}
  = \frac{1}{2}\alpha_1. \tag{20.59}
$$

The finite, exponentiated element is thus

$$
S(B_1) = e^{\frac{\alpha}{2}\alpha_1} = I\cosh\frac{\alpha}{2} + \alpha_1\sinh\frac{\alpha}{2}, \tag{20.60}
$$

where $\tanh\alpha = v/c$. Notice that the half-valued arguments have no effect on
translations.
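
The hyperbolic form of the boost can be verified in the same way as the
rotation. The sketch below assumes the block representation
$\gamma^0 = {\rm diag}(-1, 1)$, $\gamma^1 = \begin{pmatrix}0 & -\sigma_1\\ \sigma_1 & 0\end{pmatrix}$,
so that $\alpha_1 = \gamma^0\gamma^1$; the Taylor-series `expm` helper and the
sample velocity are ours.

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA1 = [[0, 1], [1, 0]]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


IDENTITY = block(I2, Z2, Z2, I2)
GAMMA0 = block(scale(-1, I2), Z2, Z2, I2)
GAMMA1 = block(Z2, scale(-1, SIGMA1), SIGMA1, Z2)
ALPHA1 = matmul(GAMMA0, GAMMA1)                     # equals [[0, sigma_1], [sigma_1, 0]]


def expm(a, terms=60):
    """Matrix exponential by truncated Taylor series (adequate for small norms)."""
    result, term = IDENTITY, IDENTITY
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = add(result, term)
    return result


def spinor_boost(rapidity):
    """S(B_1) = exp(rapidity/2 * alpha_1)."""
    return expm(scale(rapidity / 2.0, ALPHA1))


rapidity = math.atanh(0.6)                          # hypothetical v/c = 0.6
S = spinor_boost(rapidity)
S_inv = spinor_boost(-rapidity)
closed_form = add(scale(math.cosh(rapidity / 2), IDENTITY),
                  scale(math.sinh(rapidity / 2), ALPHA1))
# S^{-1} gamma^0 S mixes gamma^0 and gamma^1 with cosh and sinh of the full rapidity
boosted_gamma0 = matmul(matmul(S_inv, GAMMA0), S)
expected_gamma0 = add(scale(math.cosh(rapidity), GAMMA0), scale(math.sinh(rapidity), GAMMA1))
```

For $v/c = 0.6$ the coefficients are $\cosh\alpha = 1.25$ and $\sinh\alpha = 0.75$:
the spinor matrix carries half-arguments, while the $\gamma$-matrices themselves
transform with the full rapidity.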


20.3.3 Parity and time reversal 


The meaning of parity invariance is intrinsically linked to the number of 
spacetime dimensions, since an even number of reflections about spatial axes is 
equivalent to a rotation, and is therefore simply connected to the infinitesimally 
generated group. In that case, spatial reflection is defined by a reflection in an 
odd number of axes. In odd numbers of spatial dimensions, reflections in all axes 
lead to a ‘large’ transformation which cannot be generated by exponentiated 
infinitesimal generators. 
Consider the case in 3 + 1 dimensions. For a space inversion, one has 


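$$
\begin{aligned}
L_R^{-1}\gamma^0 L_R &= \gamma^0 \\
L_R^{-1}\gamma^i L_R &= -\gamma^i.
\end{aligned} \tag{20.61}
$$

Thus, the parity transformation can be represented by:

$$
S(P) = e^{i\phi}\gamma^0 \tag{20.62}
$$

in Dirac space. This exchanges the upper and lower spinor contributions.
Similarly, a time inversion

$$
\begin{aligned}
L_R^{-1}\gamma^0 L_R &= -\gamma^0 \\
L_R^{-1}\gamma^i L_R &= \gamma^i
\end{aligned} \tag{20.63}
$$

can be given the form

$$
S(T) = e^{i\phi}\gamma^0\gamma_5. \tag{20.64}
$$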


20.3.4 Charge conjugation 


Charge conjugation transforms a positive energy solution with charge q into 
a negative energy solution with charge —q. One searches for a unitary 
transformation $C$ with the following properties:

$$
\begin{aligned}
C\,\psi\,C^{-1} &= \eta_c\,\bar\psi^{\rm T} \\
C\,A_\mu\,C^{-1} &= -A_\mu.
\end{aligned} \tag{20.65}
$$

$\eta_c$ is a possible intrinsic property of the field, $\eta_c^2 = 1$. The two features of this
transformation are that it exchanges positive for negative energies and that it
reflects the vector field like an axial vector. Since $A_\mu$ always multiplies the
charge $q$, in the covariant derivative, this is equivalent to changing the sign of
the charge. The action for the gauged Dirac equation is ($\hbar = c = 1$),


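$$
S = \int (dx)\,\bar\psi\,(-i\gamma^\mu D_\mu + m)\,\psi, \tag{20.66}
$$

where $D_\mu = \partial_\mu + iqA_\mu$. In order to find a transformation which exchanges $\psi$
with $\bar\psi$, one begins by integrating by parts:

$$
S = \int (dx)\,\bar\psi\left(i\gamma^\mu(\overleftarrow{\partial_\mu} - iqA_\mu) + m\right)\psi, \tag{20.67}
$$

then, taking the transpose:

$$
S = \int (dx)\,\psi^{\rm T}\left(i\gamma^{\mu{\rm T}}(\partial_\mu - iqA_\mu) + m\right)\bar\psi^{\rm T}. \tag{20.68}
$$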


This has almost the same form as the original, untransposed action, with 
opposite charge. In order to make it identical, we require a matrix which has 
the property 


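$$
C\,\gamma^{\mu{\rm T}}\,C^{-1} = -\gamma^\mu. \tag{20.69}
$$

Introducing such a matrix, one has

$$
\begin{aligned}
S &= \int (dx)\,(\psi^{\rm T}C^{-1})\,(-i\gamma^\mu D_\mu^* + m)\,(C\bar\psi^{\rm T}) \\
&= \int (dx)\,\bar\psi^c\,(-i\gamma^\mu D_\mu^* + m)\,\psi^c,
\end{aligned} \tag{20.70}
$$

where the charge conjugated field is $\psi^c = C\bar\psi^{\rm T}$.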

The existence of a matrix C, in 3 + 1 dimensions, possessing the above 
properties can be determined as follows [112]. Taking the transpose of the 
Clifford algebra relation, 


$$
\{\gamma^{\mu{\rm T}}, \gamma^{\nu{\rm T}}\} = -2g^{\mu\nu}, \tag{20.71}
$$

one sees that the transposed $\gamma$-matrices also satisfy the algebra, and must
therefore be related to the untransposed ones by a similarity transformation

$$
\gamma^{\mu{\rm T}} = B^{-1}\gamma^\mu B. \tag{20.72}
$$



In 3 + 1 dimensions, the 4 × 4 $\gamma$-matrices are irreducible, and thus the existence
of a non-singular, unitary $B$ is guaranteed. Taking the transpose of eqn. (20.72)
and re-using the relation to replace for $\gamma^\mu$, one obtains

$$
\gamma^{\mu{\rm T}} = (B^{-1}B^{\rm T})\,\gamma^{\mu{\rm T}}\,(B^{-1{\rm T}}B), \tag{20.73}
$$

thus establishing that $B^{-1}B^{\rm T}$ commutes with all the $\gamma$-matrices. From Schur's
lemma, it follows that this must be a multiple of the identity:

$$
B^{-1}B^{\rm T} = cI. \tag{20.74}
$$

Taking the inverse and then the complex conjugate of this relation, one finds

$$
\begin{aligned}
\frac{1}{c^*} &= \left[(B^{-1}B^{\rm T})^{-1}\right]^* \\
&= (B^\dagger)^{-1}B^* && (20.75)\\
&= BB^*, && (20.76)
\end{aligned}
$$

where we have used the unitarity $B^\dagger B = I$. This means that $c$ is real, and
furthermore that $c = \pm 1$, i.e.

$$
B = \pm B^{\rm T}, \tag{20.77}
$$

so, from this, the matrix is either symmetrical or anti-symmetrical. An addi- 
tional constraint comes from the number of symmetrical and anti-symmetrical 
degrees of freedom in the 4 × 4 $\gamma$-matrices. If $B$ is anti-symmetric, then the
six matrices $\gamma^\mu B$, $\gamma_5 B$, $B$ are also anti-symmetric, whereas the ten matrices
$B\gamma_5\gamma^\mu$, $B\sigma^{\mu\nu}$ are symmetrical. This matches the number of anti-symmetrical
degrees of freedom in a 4 x 4 matrix representation. Conversely, if one takes 
B to be symmetrical, then the numbers are reversed and it does not match. One 
concludes, then, that B is an anti-symmetric, unitary matrix. This result was 
shown by Pauli in 1935. It has now been shown that it is possible to construct 
a similarity transformation which turns y-matrices into their transposes. The 
matrix we require is now 


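$$
C = -i\gamma_5 B. \tag{20.78}
$$

With this definition, we have

$$
\begin{aligned}
C^{-1}\gamma^\mu C &= B^{-1}\,(i\gamma_5)\,\gamma^\mu\,(-i\gamma_5)\,B = -B^{-1}\gamma^\mu B \\
&= -\gamma^{\mu{\rm T}}.
\end{aligned} \tag{20.79}
$$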


C is thus a charge conjugation matrix for Dirac spinors. 
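
An explicit $B$ and $C$ can be exhibited for a concrete representation. The
choice $B = \gamma^1\gamma^3$ below is our own, made for the block representation
$\gamma^0 = {\rm diag}(-1, 1)$, $\gamma^i = \begin{pmatrix}0 & -\sigma^i\\ \sigma^i & 0\end{pmatrix}$
(in which $\gamma^0$ and $\gamma^2$ are symmetric and $\gamma^1$, $\gamma^3$ are
anti-symmetric); it is not taken from the text, and other representations need
a different $B$. The sketch checks the stated properties numerically.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dagger(a):
    return [[complex(x).conjugate() for x in row] for row in transpose(a)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


IDENTITY = block(I2, Z2, Z2, I2)
GAMMA = [block(scale(-1, I2), Z2, Z2, I2)] + [block(Z2, scale(-1, s), s, Z2) for s in PAULI]
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))

B = matmul(GAMMA[1], GAMMA[3])              # assumed choice for this representation
C = scale(-1j, matmul(GAMMA5, B))           # C = -i gamma_5 B

b_antisymmetric = close(transpose(B), scale(-1, B))
b_unitary = close(matmul(dagger(B), B), IDENTITY)
c_unitary = close(matmul(dagger(C), C), IDENTITY)
# B^{-1} gamma^mu B = (gamma^mu)^T, using B^{-1} = B^dagger
b_transposes = all(close(matmul(matmul(dagger(B), g), B), transpose(g)) for g in GAMMA)
# C^{-1} gamma^mu C = -(gamma^mu)^T, using C^{-1} = C^dagger
c_conjugates = all(close(matmul(matmul(dagger(C), g), C), scale(-1, transpose(g)))
                   for g in GAMMA)
```

All five flags are `True`, confirming that an anti-symmetric, unitary $B$ exists
for this representation and that $C = -i\gamma_5 B$ turns each $\gamma^\mu$ into
minus its transpose.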
20.4 Chirality in 3 + 1 dimensions


The field equations of massless spinors are, from eqn. (20.101), 


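$$
\begin{pmatrix} -p_0 & -\sigma^i p_i \\ \sigma^i p_i & p_0 \end{pmatrix}
\begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix} = 0 \tag{20.80}
$$

or

$$
\begin{aligned}
\sigma^i\hat p_i\,\chi_L &= 2\lambda\chi_L = -\chi_L \\
\sigma^i\hat p_i\,\chi_R &= 2\lambda\chi_R = +\chi_R,
\end{aligned} \tag{20.81}
$$

where $\lambda = \frac{1}{2}\sigma^i\hat p_i$, and

$$
\begin{aligned}
\chi_L &= \chi_1 + \chi_2 \\
\chi_R &= \chi_1 - \chi_2.
\end{aligned} \tag{20.82}
$$

These equations are referred to as the Weyl equations, and $\chi_L$ and $\chi_R$ are
Weyl-spinors. For such massless particles, the eigenvalue of $\gamma_5$ is referred to as
the chirality of the solution:

$$
\begin{aligned}
\gamma_5\,u(p, \lambda) &= 2\lambda\,u(p, \lambda) \\
\gamma_5\,v(p, \lambda) &= -2\lambda\,v(p, \lambda).
\end{aligned} \tag{20.83}
$$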


A projection operator for the chirality states is thus 


$$
P_\pm = \frac{1}{2}\,(1 \pm \gamma_5). \tag{20.84}
$$

Particles with helicity $+\frac{1}{2}$ are referred to as right handed, while particles with
helicity $-\frac{1}{2}$ are referred to as left handed. Only left handed neutrinos interact
by the weak interaction and appear in the Standard Model. Symmetry under the
continuous transformation

$$
\psi(x) \rightarrow e^{i\gamma_5 s}\,\psi(x) \tag{20.85}
$$


is known as chiral symmetry. 
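
The projector property and the action of a chiral rotation can be illustrated
with explicit matrices. The sketch assumes a representation in which $\gamma_5$ is
the off-diagonal block matrix with unit blocks, and uses the standard form
$P_\pm = \frac{1}{2}(1 \pm \gamma_5)$; the parameter name `s` and the helper
names are ours.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


IDENTITY = block(I2, Z2, Z2, I2)
ZERO = block(Z2, Z2, Z2, Z2)
GAMMA5 = block(Z2, I2, I2, Z2)             # assumed form of gamma_5

P_PLUS = scale(0.5, add(IDENTITY, GAMMA5))
P_MINUS = scale(0.5, add(IDENTITY, scale(-1, GAMMA5)))


def chiral_rotation(s):
    """exp(i gamma_5 s) = cos(s) + i gamma_5 sin(s), valid because gamma_5^2 = 1."""
    return add(scale(math.cos(s), IDENTITY), scale(1j * math.sin(s), GAMMA5))


s = 0.4                                     # hypothetical rotation parameter
rotated_plus = matmul(chiral_rotation(s), P_PLUS)
rotated_minus = matmul(chiral_rotation(s), P_MINUS)
```

The two chirality components pick up opposite phases, $e^{+is}$ and $e^{-is}$,
under the transformation, which is why a mass term that couples them is not
invariant.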


20.5 Field continuity 


The variation of the action leads to surface terms, 


$$
\hbar c\int d\sigma^\mu\,(\delta\bar\psi\,\gamma_\mu\,\psi), \tag{20.86}
$$

for $\bar\psi$ variations, and

$$
\hbar c\int d\sigma^\mu\,(\bar\psi\,\gamma_\mu\,\delta\psi), \tag{20.87}
$$

for $\psi$ variations. This provides us with definitions for the conjugate momenta
across spacelike hyper-surfaces $\sigma$:


ATL, =yo'4, (20.88) 
for the variable conjugate to $\psi$, and
4 = Ay yo, (20.89) 


for the variable conjugate to $\bar\psi$. The canonical values for these momenta are


O = yw = yi. (20.90) 


20.6 Conserved norm and probability 


The linear nature of the Dirac action implies that the conserved current is 
independent of derivatives. This means that the sign of the energy $i\hbar c\,\partial_0$ cannot
change the sign of the conserved probability, thus the Dirac equation does not 
suffer the problem of negative norms or probabilities as does the Klein—Gordon 
equation. 

To determine the conserved current, one considers the effect of an infinitesimal
$x$-independent phase transformation $\delta s$:


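$$
\begin{aligned}
\delta S = \int (dx)\left(-\tfrac{1}{2}i\hbar c\right)\Big[\, & (\bar\psi e^{-is})\,\gamma^\mu\,\partial_\mu\big((i\delta s)e^{is}\psi\big) \\
& + \big(\bar\psi e^{-is}(-i\delta s)\big)\,\gamma^\mu\,\partial_\mu(e^{is}\psi) \\
& - \partial_\mu\big(\bar\psi e^{-is}(-i\delta s)\big)\,\gamma^\mu\,(e^{is}\psi) \\
& - \partial_\mu(\bar\psi e^{-is})\,\gamma^\mu\,\big((i\delta s)e^{is}\psi\big)\Big].
\end{aligned} \tag{20.91}
$$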


Integrating by parts to remove derivatives from ôs, and using the equations of 
motion, one arrives at the simple expression 


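$$
\delta S = \hbar c\int d\sigma^\mu\,(\bar\psi\,\gamma_\mu\,\psi)\,\delta s, \tag{20.92}
$$

which defines a conserved current $\delta S = \int d\sigma^\mu J_\mu\,\delta s$. This motivates the
definition of an inner product given by

$$
(\psi_1, \psi_2) = -\int d\sigma^\mu\,\bar\psi_1\gamma_\mu\psi_2, \tag{20.93}
$$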
giving the norm of the field as

$$
\|\psi\|^2 = -\int d\sigma^\mu\,\bar\psi\gamma_\mu\psi. \tag{20.94}
$$

The canonical interpretation of this is

$$
(\psi_1, \psi_2) = -\int d\sigma\,\bar\psi_1\gamma_0\psi_2, \tag{20.95}
$$

which means that the norm may also be written

$$
\|\psi\|^2 = -\int d\sigma\,\bar\psi\gamma_0\psi = \int d\sigma\,\psi^\dagger\psi. \tag{20.96}
$$


The norm of the field is defined separately on the manifold of positive and 
negative energy solutions. 


20.7 Free-field solutions in n = 3 
The free-field equation is 
$$
(-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)_{\alpha\beta}\,\psi_\beta(x) = 0, \tag{20.97}
$$

where $\psi_\alpha(x)$ is a $2^l$-component vector for some $l \geq n/2$, which lives on spinor
space (usually these indices are suppressed). In a given number of dimensions,
we may express this equation in terms of a representation of the $\gamma$-matrices. In
three dimensions we may use eqn. (20.18) to write


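$$
\begin{pmatrix} i\hbar\partial_t + mc^2 & i\hbar c\,\sigma^i\partial_i \\
-i\hbar c\,\sigma^i\partial_i & -i\hbar\partial_t + mc^2 \end{pmatrix}\psi = 0, \tag{20.98}
$$

where we suppress the $\alpha, \beta$ spinor indices. The blocks are now 2 × 2 matrices,
and the spinor may also be written in terms of two two-component spinors $u$:

$$
\psi = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}. \tag{20.99}
$$

If we transform the spinors to momentum space,

$$
\psi(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\,e^{ikx}\,\psi(k), \tag{20.100}
$$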


then the field equations may be written as 


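$$
\begin{pmatrix} -p_0c + mc^2 & -c\,\sigma^i p_i \\
c\,\sigma^i p_i & p_0c + mc^2 \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = 0, \tag{20.101}
$$

where $p_\mu = \hbar k_\mu$. This matrix equation has non-zero solutions for $\psi$ only if the
determinant of the operator vanishes. Thus,

$$
\det = p^2c^2 + m^2c^4 = 0. \tag{20.102}
$$

Here one makes use of the fact that

$$
\begin{aligned}
(\sigma^i p_i)^2 &= \sigma^i\sigma^j\,p_i p_j \\
&= \left(\frac{1}{2}[\sigma^i, \sigma^j] + \frac{1}{2}\{\sigma^i, \sigma^j\}\right) p_i p_j \\
&= (i\epsilon^{ijk}\sigma_k + \delta^{ij})\,p_i p_j \\
&= p^i p_i.
\end{aligned} \tag{20.103}
$$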


Eqn. (20.102) indicates that the solutions of the Dirac equation must satisfy the 
relativistic energy relation. Thus the Dirac field also satisfies a Klein—Gordon 
equation, which may be seen by operating on eqn. (20.97) with the conjugate of 
the Dirac operator: 


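$$
\begin{aligned}
(i\hbar c\,\gamma^\mu\partial_\mu + mc^2)(-i\hbar c\,\gamma^\nu\partial_\nu + mc^2)\,\psi(x) &= 0 \\
(-\hbar^2c^2\,\Box + m^2c^4)\,\psi &= 0.
\end{aligned} \tag{20.104}
$$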


The last line follows from eqn. (20.20). The vanishing of the determinant also 
gives us a relation which will be useful later, namely 


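$$
\frac{(p_0c + mc^2)}{c\,\sigma^i p_i} = -\frac{c\,\sigma^i p_i}{(-p_0c + mc^2)}. \tag{20.105}
$$

The 2 × 2 components of eqn. (20.101) are now

$$
\begin{aligned}
(-p_0c + mc^2)\,u_1 - c\,(\sigma^i p_i)\,u_2 &= 0 \\
c\,(\sigma^i p_i)\,u_1 + (p_0c + mc^2)\,u_2 &= 0,
\end{aligned} \tag{20.106}
$$

which implies that the two-component spinors $u$ are linearly dependent:

$$
\begin{aligned}
u_1 &= \frac{c\,(\sigma^i p_i)}{(-p_0c + mc^2)}\,u_2 \\
u_1 &= -\frac{(p_0c + mc^2)}{c\,(\sigma^i p_i)}\,u_2.
\end{aligned} \tag{20.107}
$$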


The consistency of these apparently contradictory relations is secured by the 
determinant constraint in eqn. (20.105). 
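
That the two relations agree on the mass shell can be confirmed numerically.
The sketch below assumes a positive energy solution, $-p_0c = E = \sqrt{p^2c^2 + m^2c^4}$,
and uses $(\sigma^i p_i)^{-1} = \sigma^i p_i / p^2$, which follows from
$(\sigma^i p_i)^2 = p^i p_i$. Function names and sample values are hypothetical.

```python
import math


def sigma_dot_p(p):
    """The 2x2 matrix sigma^i p_i for a 3-vector p = (p1, p2, p3)."""
    return [[p[2], p[0] - 1j * p[1]],
            [p[0] + 1j * p[1], -p[2]]]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def energy(p, m, c=1.0):
    """Positive root of the mass shell condition E^2 = p^2 c^2 + m^2 c^4."""
    return math.sqrt(sum(x * x for x in p) * c**2 + m**2 * c**4)


def u1_from_first_relation(p, u2, m, c=1.0):
    """u_1 = c (sigma.p) / (-p_0 c + m c^2) u_2, with -p_0 c = E."""
    coeff = c / (energy(p, m, c) + m * c**2)
    return [coeff * x for x in matvec(sigma_dot_p(p), u2)]


def u1_from_second_relation(p, u2, m, c=1.0):
    """u_1 = -(p_0 c + m c^2) / (c sigma.p) u_2, with 1/(sigma.p) = (sigma.p)/p^2."""
    p_squared = sum(x * x for x in p)
    coeff = (energy(p, m, c) - m * c**2) / (c * p_squared)
    return [coeff * x for x in matvec(sigma_dot_p(p), u2)]


momentum = (0.3, -0.8, 1.1)          # hypothetical sample momentum
lower = [0.6, 0.8j]                  # hypothetical lower two-component spinor u_2
first = u1_from_first_relation(momentum, lower, m=1.5)
second = u1_from_second_relation(momentum, lower, m=1.5)

# (sigma.p)^2 applied to u_2 returns p^2 u_2
squared = matvec(sigma_dot_p(momentum), matvec(sigma_dot_p(momentum), lower))
p2 = sum(x * x for x in momentum)
```

The components of `first` and `second` coincide to rounding, but only because
the energy was placed on the mass shell; with an arbitrary energy they differ.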

In spite of the linear (first-order) derivative in the Dirac action, the deter- 
minant condition for non-trivial solutions leads us straight back to a quadratic 
constraint on the allowed spectrum of energy and momenta. This means that 
both positive and negative energies are allowed in the Dirac equation, exactly
as in the Klein—Gordon case. The linear derivative does cure the negative 
probabilities, however, as we show below. 

The solutions of the Dirac equation may be written in various forms. A direct 
attempt to apply the field equation constraint in a delta function, by analogy 
with the scalar field, cannot work directly, since the delta function cannot have 
a matrix argument. However, by introducing a projection operator
$(-\gamma^\mu p_\mu c + mc^2)/2|p_0|$, it is possible to write

$$
\psi(x) = \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\,e^{ikx}\,(-\gamma^\mu p_\mu c + mc^2)\,\delta(p^2c^2 + m^2c^4)\,u(k), \tag{20.108}
$$


where $p = \hbar k$ and $u(k)$ is a mass shell spinor. The projection term ensures that
application of the Dirac operator leads to the squared mass shell constraint. By
inserting $\theta(\pm k_0)$ alongside the delta function, one can also restrict this to the
manifold of positive or negative energy solutions, i.e.

$$
\begin{aligned}
\psi_\pm(x) &= \int \frac{d^{n+1}k}{(2\pi)^{n+1}}\,e^{ikx}\,\theta(\mp k_0)\,(-\gamma^\mu p_\mu c + mc^2)\,\delta(p^2c^2 + m^2c^4)\,u_\pm(k) \\
&= \int \frac{d^{n}k}{(2\pi)^{n}}\,e^{ikx}\,\frac{mc^2}{c^2\hbar^2|k_0|}\,u_\pm(k),
\end{aligned} \tag{20.109}
$$

since

$$
(-\gamma^\mu p_\mu c + mc^2) = 2mc^2, \tag{20.110}
$$

when $p_\mu$ is on the mass shell $\gamma^\mu p_\mu c + mc^2 = 0$. In the literature it is customary
to proceed by examining the positive and negative energy cases separately. As 
we shall see below, solutions of the Dirac equation can be normalized on either 
the positive or negative energy solution spaces. 

It is more usual to consider positive and negative energy solutions to the Dirac 
equation separately. To this end, there is sufficient freedom in the expression 


n+1 
L(x) -| ae e*s (p°? + med AEko ( in ) ne) 


(27)"+1 
dk e-on co! pi 

2 E | etme |) Nu. (20.111) 
2x)" 2|E|ch 1 


The two-component spinors u are taken to be a linear combination of the spin 
eigenfunctions for spin up and spin down, as measured conventionally along the 


$z$ axis

$$
u_i = c_1\begin{pmatrix} 1 \\ 0 \end{pmatrix} + c_2\begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{20.112}
$$

where $i = 1, 2$ and $c_1^2 + c_2^2 = 1$. Unlike the case of the Klein-Gordon equation,
both positive and negative energy solutions can be normalized to unity, although
this is not necessarily an interesting choice of normalization. An example:
consider the normalization of the positive energy solutions ($-p_0c = E$),


1 = (WP), WM a) 
d”+!k d'tip eilk-k)x 2(~i í 2 
T gl ut (PO +1) lu 
— (27 )rt! 4E2C2h (E + mc?) 


(20.113) 


2E 
= = E+mc?)- 


Assuming a box normalization, where $\int d^nk/(2\pi)^n \rightarrow L^{-n}\sum_k$, we have


N, = 2L" ehE +mc?), (20.114) 
and hence 
. 2 co! pi 
WO (k) = LP io | (E Eme) ) Œ+m® | x, (20.115) 
2E 1 
where 


eza] , ra (20.116) 


20.8 Invariant normalization in p-space 


The normalization of Dirac fields is a matter of some subtlety. Different 
invariant normalizations are used for different purposes. The usual case is to 
consider plane wave solutions, or wave-packets. Consider the probability on a 
spacelike hyper-surface, transforming as the zeroth component of a vector: 


= d”t!k $ 
Tombs) = f Sa u OE) 


x (—y" puc + mc?) (pc? + m?c’) 


J ae i (k)tuck) = 1 (20.117) 
= u(k)'u(k) = 1. : 
(21)" 2| pol 

The factor of $2mc^2/2|p_0|$ is required to ensure that the spinors satisfy the
equations of motion for the Dirac field. This indicates that the invariant 
normalization for the spinors should be 


Tornas? ee (20.118) 
Consider now what this means for the product $\bar u(k)u(k)$. The field equations for
these, in momentum space, are:

$$
\begin{aligned}
(\gamma^0 p_0c + \gamma^i p_ic + mc^2)\,u(k) &= 0 \\
\bar u(k)\,(\gamma^0 p_0c + \gamma^i p_ic + mc^2) &= 0.
\end{aligned} \tag{20.119}
$$

Multiplying the first of these by $u^\dagger$, and using the fact that $\bar u = u^\dagger\gamma^0$, gives:

$$
\bar u\,(p_0c + \gamma^0\gamma^i p_ic + mc^2\gamma^0)\,u(k) = 0. \tag{20.120}
$$

Now, multiplying the second (adjoint) equation on the right hand side by $\gamma^0 u$
and commuting $\gamma^0$ through the left hand side, one has:

$$
\bar u\,(p_0c - \gamma^0\gamma^i p_ic + mc^2\gamma^0)\,u(k) = 0. \tag{20.121}
$$

Thus, adding eqns. (20.121) and (20.120), leaves

$$
2p_0c\,\bar u u + 2mc^2\,u^\dagger u = 0. \tag{20.122}
$$

Taking the normalization for $u^\dagger u$ in eqn. (20.118), we find that

$$
\bar u u = \frac{-p_0}{|p_0|} = \frac{E}{|E|}.
$$


Thus, a positive energy spinor is normalized with positive norm, whilst a 
negative energy spinor has a negative norm, in momentum space. It is custom to 
refer to the positive and negative energy spinors as u(k) and v(k), respectively. 
Accordingly, one takes the invariant normalization to be 


$$ \bar u_r u_s = \delta_{rs} \tag{20.123} $$

$$ \bar v_r v_s = -\delta_{rs}, \tag{20.124} $$


with spinor indices shown. 
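
The sign of the norm can be checked numerically. The following is a minimal sketch, not the author's construction: it assumes the Dirac representation of the $\gamma$-matrices (with $(\gamma^0)^2 = 1$), the momentum-space equation (20.119), units $c = 1$, and the normalization $u^\dagger u = |E|/mc^2$. The helper names (`dirac_matrix`, `spinor`, `bar_norm`) are hypothetical.

```python
import math

# Pauli matrices; the Dirac representation used below is an assumption of this sketch.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def sigma_dot(p):
    """2x2 matrix sigma^i p_i for a 3-momentum p."""
    return [[sum(SIGMA[i][a][b] * p[i] for i in range(3)) for b in range(2)]
            for a in range(2)]

def dirac_matrix(p0c, p, m, c=1.0):
    """gamma^0 p_0 c + gamma^i p_i c + m c^2 with gamma^0 = diag(1, 1, -1, -1)
    and gamma^i = [[0, sigma_i], [-sigma_i, 0]] (metric signature -+++)."""
    sp = sigma_dot(p)
    mat = [[0j] * 4 for _ in range(4)]
    for a in range(2):
        mat[a][a] += p0c + m * c**2
        mat[a + 2][a + 2] += -p0c + m * c**2
        for b in range(2):
            mat[a][b + 2] += c * sp[a][b]
            mat[a + 2][b] -= c * sp[a][b]
    return mat

def spinor(p, m, chi, positive=True, c=1.0):
    """Plane-wave spinor, normalized so that u^dagger u = |E| / (m c^2)."""
    energy = math.sqrt(c**2 * sum(x * x for x in p) + (m * c**2) ** 2)
    sp = sigma_dot(p)
    small = [c * sum(sp[a][b] * chi[b] for b in range(2)) / (energy + m * c**2)
             for a in range(2)]
    if positive:   # -p_0 c = E > 0
        w = [chi[0], chi[1], small[0], small[1]]
    else:          # -p_0 c = E < 0
        w = [-small[0], -small[1], chi[0], chi[1]]
    scale = math.sqrt(energy / (m * c**2) / sum(abs(x) ** 2 for x in w))
    return [scale * x for x in w], energy

def residual(mat, w):
    """Largest component of mat @ w; zero if w solves the field equation."""
    return max(abs(sum(mat[a][b] * w[b] for b in range(4))) for a in range(4))

def bar_norm(w):
    """u-bar u = u^dagger gamma^0 u."""
    upper = sum(abs(w[a]) ** 2 for a in range(2))
    lower = sum(abs(w[a]) ** 2 for a in range(2, 4))
    return upper - lower

P, M = (0.3, -0.4, 1.2), 1.0
u, E = spinor(P, M, (1.0, 0.0), positive=True)
v, _ = spinor(P, M, (0.0, 1.0), positive=False)
print(residual(dirac_matrix(-E, P, M), u), bar_norm(u))
print(residual(dirac_matrix(+E, P, M), v), bar_norm(v))
```

Under these assumptions the positive energy spinor has $\bar u u$ positive and the negative energy spinor has $\bar v v$ negative, in line with the discussion above.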


20.9 Formal solution by Green functions 


The formal solution to the free equations of motion (V = 0) may be written 


$$ \psi(x) = \int (dx')\, S(x, x')\, J(x'), \tag{20.125} $$


and the conjugate form


$$ \psi^\dagger(x) = \int (dx')\, J^\dagger(x')\, S^\dagger(x, x'). \tag{20.126} $$
20.10 Expressions for the Green functions


The Green functions can be obtained from the corresponding Green functions 
for the scalar field; see section 5.6: 


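$$ (-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)(i\hbar c\,\gamma^\nu\partial_\nu + mc^2) = -\hbar^2c^2\,\Box + m^2c^4 + \frac{1}{2}\hbar^2c^2\,[\gamma^\mu, \gamma^\nu]\,\partial_\mu\partial_\nu, \tag{20.127} $$


and the latter term vanishes when operating on non-singular objects. It follows
for the free field that


$$ (i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,G^{(\pm)}(x, x') = S^{(\pm)}(x, x') \tag{20.128} $$
$$ (i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,G_F(x, x') = S_F(x, x') \tag{20.129} $$
$$ (-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,S^{(\pm)}(x, x') = 0 \tag{20.130} $$
$$ (-i\hbar c\,\gamma^\mu\partial_\mu + mc^2)\,S_F(x, x') = \delta(x, x'). \tag{20.131} $$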
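
The factorization behind these relations rests on the Clifford algebra of the $\gamma$-matrices. The sketch below is illustrative only: it assumes the Dirac representation, the metric signature $(-,+,+,+)$ with $\{\gamma^\mu, \gamma^\nu\} = -2g^{\mu\nu}$, and works in momentum space ($\partial_\mu \to ik_\mu$) with $\hbar = c = 1$ by default. The function names are hypothetical.

```python
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
G_UP = (-1, 1, 1, 1)  # diagonal of g^{mu nu}, signature (-, +, +, +)

def gamma(mu):
    """gamma^mu in the Dirac representation (an assumption of this sketch)."""
    mat = [[0j] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            if mu == 0:
                mat[a][b] = 1.0 * (a == b)
                mat[a + 2][b + 2] = -1.0 * (a == b)
            else:
                mat[a][b + 2] = PAULI[mu - 1][a][b]
                mat[a + 2][b] = -PAULI[mu - 1][a][b]
    return mat

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def max_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(4) for j in range(4))

def scaled_identity(s):
    return [[s * (i == j) for j in range(4)] for i in range(4)]

def clifford_error():
    """Largest violation of {gamma^mu, gamma^nu} = -2 g^{mu nu}."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            ab = matmul(gamma(mu), gamma(nu))
            ba = matmul(gamma(nu), gamma(mu))
            anti = [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]
            target = scaled_identity(-2.0 * G_UP[mu] * (mu == nu))
            worst = max(worst, max_diff(anti, target))
    return worst

def factorized(k_down, m, hbar=1.0, c=1.0):
    """(hbar c gamma.k + m c^2)(-hbar c gamma.k + m c^2) in momentum space."""
    gammas = [gamma(mu) for mu in range(4)]
    slash = [[sum(gammas[mu][i][j] * k_down[mu] for mu in range(4))
              for j in range(4)] for i in range(4)]
    plus = [[hbar * c * slash[i][j] + m * c**2 * (i == j) for j in range(4)]
            for i in range(4)]
    minus = [[-hbar * c * slash[i][j] + m * c**2 * (i == j) for j in range(4)]
             for i in range(4)]
    return matmul(plus, minus)

K = (2.0, 1.0, -3.0, 0.5)                            # k_mu (lower index)
K2 = sum(G_UP[mu] * K[mu] ** 2 for mu in range(4))   # k^mu k_mu
MASS = 1.5
```

With these conventions the product reduces to $(\hbar^2c^2k^2 + m^2c^4)$ times the identity, which is the momentum-space form of the Klein-Gordon operator.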


20.11 The energy-momentum tensor 


The application of Noether’s theorem for spacetime translations leads to a 
symmetrical energy-momentum tensor. In accordance with the other fields, the 
zero—zero component of the energy-momentum tensor has the interpretation 
of an energy density or Hamiltonian density. This is to be distinguished from 
the differential Hamiltonian operator, which generates the time evolution of the 
field. We have, 


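$$ \begin{aligned} \theta_{00} &= \frac{\partial\mathcal{L}}{\partial(\partial^0\psi)}(\partial_0\psi) + (\partial_0\bar\psi)\frac{\partial\mathcal{L}}{\partial(\partial^0\bar\psi)} - \mathcal{L}\,g_{00} \\ &= -\frac{i\hbar c}{2}\,\bar\psi\gamma_0(\partial_0\psi) + \frac{i\hbar c}{2}\,(\partial_0\bar\psi)\gamma_0\psi + \mathcal{L}. \end{aligned} \tag{20.132} $$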


Using the equation of motion (20.2), the integral of this quantity over all space 
may be written as 


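$$ \int d\sigma\,\theta_{00} = \int d\sigma\,\bar\psi\,(-i\hbar c\,\gamma^i\partial_i + mc^2 + V)\,\psi = (\psi, H_{\rm D}\psi), \tag{20.133} $$
where we have used $(\gamma^0)^2 = 1$. This expression is formally the expectation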
value of the differential Hamiltonian operator, but it is also used as the definition 
of a ‘field theoretical’ Hamiltonian. In the second quantization, where the fields 
are operator-valued, this expression is referred to as the Hamiltonian operator 


and may be thought of as generating the time evolution of the fully quantized 
field. 
The spacetime components of the energy-momentum tensor are not explicitly
symmetrical. This is a consequence of the linear derivative in the field equation
(20.2). However, the off-diagonal components can be shown to be equal
provided the field satisfies the equations of motion. We have


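$$ \begin{aligned} \theta_{0i} &= \frac{\partial\mathcal{L}}{\partial(\partial^0\psi)}(\partial_i\psi) + (\partial_i\bar\psi)\frac{\partial\mathcal{L}}{\partial(\partial^0\bar\psi)} \\ &= -\frac{i\hbar c}{2}\,\bar\psi\gamma_0(\partial_i\psi) + \frac{i\hbar c}{2}\,(\partial_i\bar\psi)\gamma_0\psi. \end{aligned} \tag{20.134} $$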


Taking the integral over all space allows us to integrate by parts, giving 


$$ \int d\sigma\,\theta_{0i} = -i\hbar c\int d\sigma\,\bar\psi\gamma_0\partial_i\psi = -(\psi, p_i c\,\psi), \tag{20.135} $$


where $p_i = -i\hbar\partial_i$. Thus, this component is identified with the momentum in
the field. Switching the order of the indices, we have


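$$ \begin{aligned} \theta_{i0} &= \frac{\partial\mathcal{L}}{\partial(\partial^i\psi)}(\partial_0\psi) + (\partial_0\bar\psi)\frac{\partial\mathcal{L}}{\partial(\partial^i\bar\psi)} \\ &= -\frac{i\hbar c}{2}\,\bar\psi\gamma_i(\partial_0\psi) + \frac{i\hbar c}{2}\,(\partial_0\bar\psi)\gamma_i\psi. \end{aligned} \tag{20.136} $$


This is clearly not the same as eqn. (20.134). However, on using the field
equation and its conjugate in eqns. (20.2) and (20.5), it may be shown that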


n hem en ae ay a yis ee ee ET 
Oio = 5 Winiyv y’ dj =y” dj y WW zY mc + Viy, 
(20.137) 


so that the integral over all space can be integrated by parts to give 


ic — ; 1 Z 
[ 2040 = fo EAT yi = sr vone + vzv} 


(20.138) 
On using the anti-commutation relations for the y-matrices, we find 
$$ \int d\sigma\,\theta_{i0} = -i\hbar c\int d\sigma\,\bar\psi\gamma_0\partial_i\psi = \int d\sigma\,\theta_{0i}. \tag{20.139} $$
The diagonal space components are given by
Oui = r iW) + CW) 
w 7 v z 


= -Fp aw +L, Oreo) 
where i is not summed. The off-diagonal space components are


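$$ \begin{aligned} \theta_{ij} &= \frac{\partial\mathcal{L}}{\partial(\partial^i\psi)}(\partial_j\psi) + (\partial_j\bar\psi)\frac{\partial\mathcal{L}}{\partial(\partial^i\bar\psi)} \\ &= -\frac{i\hbar c}{2}\left( \bar\psi\gamma_i\partial_j\psi - (\partial_j\bar\psi)\gamma_i\psi \right), \end{aligned} \tag{20.141} $$


where $i \neq j$. Although not explicitly symmetrical in this form, the integral over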
all space of this quantity is symmetrical by partial integration. Note that the 
trace of the space components is given in n + 1 dimensions by 


X 0i =H + (me? + Vw + DEL, (20.142) 


so that the total trace of the energy-momentum tensor is 
0! = gO, = (mc? + Vy +- DEL. (20.143) 


This vanishes for m = V = 0 in 1 + 1 dimensions. 


20.12 Spinor electrodynamics 


The action for spinor electrodynamics is 
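$$ S_{\rm QED} = \int (dx)\left\{ \bar\psi\left( -\frac{1}{2}i\hbar c\,(\gamma^\mu\overrightarrow{D}_\mu - \gamma^\mu\overleftarrow{D}_\mu) + mc^2 \right)\psi + \frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} \right\}. \tag{20.144} $$
This is the basis of the quantum theory of electrodynamics for electrons (QED).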


Pauli [104] has shown that the Dirac action may be modified by a term of the 
form 


a uc? pà 
S —> S+ | (dx) Uae Faw, (20.145) 


whereupon the field behaves as though it has an additional (anomalous) magnetic
moment $e\hbar/2m$. Later, Foldy investigated generalizations of the Dirac
action which preserve Lorentz invariance and gauge invariance [51]. One
makes two restrictions: linearity in $A_\mu$ (weak field) and finiteness in the zero
momentum limit (independent of $\partial_\mu\psi$). The result is


S> $+ [oot bs (« i y"Au + Epo" Fe) y, 


i=0 


(20.146) 
where $\alpha_i$, $\beta_i$ are constants representing anomalous charge and magnetic
moments respectively.

There are a number of problems for which spinor electrodynamics can be
solved exactly. These include:


• the spherically symmetrical Coulomb potential [31, 38, 62, 75, 98];
• the homogeneous magnetic field [73, 81, 106, 109];
• the field of an electromagnetic plane wave [131].


A review of these is given in many books. See, for example, ref. [8]. 


21 
The Maxwell radiation field 


There are two ways of describing the interaction between matter and the 
electromagnetic field: the first and most fundamental way is to consider the 
electromagnetic field to be coupled to every individual microscopic charge in 
a physical system explicitly. Apart from these charges, the field lives on a 
background vacuum. The charges are represented by an (n + 1) dimensional
current vector $J_\mu$.

In systems with very complex distributions of charge, this approach is too 
cumbersome, and an alternative view is useful: that of an electromagnetic field 
in dielectric media. This approach is an effective-field-theory approach in which 
the average effect of a very complex, on average neutral, distribution of charges 
is taken into account by introducing an effective speed of light, or equivalently 
effective permittivities and permeabilities. Any remainder charges which make 
the system non-neutral can then be handled explicitly by an (n + 1) dimensional 
current vector. Although the second of these two approaches is a popular 
simplification in many cases, it has only a limited range of validity, whereas 
the first approach is fundamental. We shall consider these two cases separately. 


21.1 Charges in a vacuum 
21.1.1 The action 


The action for the electromagnetic field in a vacuum is given by 


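$$ S = \int (dx)\left\{ \frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} \right\}, \tag{21.1} $$
where the anti-symmetric field strength tensor is defined by
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{21.2} $$
and

$$ A^\mu = \begin{pmatrix} c^{-1}\phi \\ \mathbf{A} \end{pmatrix}. \tag{21.3} $$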


In 3 + 1 dimensions, the field components are given by 


$$ E_i = -\partial_i\phi - \partial_t A_i = c\,F_{i0} $$
$$ \epsilon_{ijk}B_k = F_{ij}, \tag{21.4} $$
where $i = 1, 2, 3$. The latter equation may be inverted to give
$$ B_i = \frac{1}{2}\epsilon_{ijk}F_{jk}. \tag{21.5} $$


Note that the indices on the electric and magnetic field vectors in 3 + 1 
dimensions are always written as subscripts, never as superscripts. In 2 + 1 
dimensions, the magnetic field is a pseudo-scalar, and one has 


$$ E_i = -\partial_i\phi - \partial_t A_i = c\,F_{i0} $$
$$ B = F_{12}, \tag{21.6} $$
where $i = 1, 2$. In higher dimensions, the tensor character of $F_{\mu\nu}$ is unavoidable,
and E and B lose their separate identities. A further important point is
that the derivatives in the action are purely classical: there are no factors of $\hbar$
present here.


A phenomenological source can be added to the Maxwell action, in an 
ambient vacuum: 


$$ S = \int (dx)\left\{ \frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} - J^\mu A_\mu \right\}. \tag{21.7} $$


This describes an electromagnetic field, extended in a vacuum around source 
charges. What we really mean here is that there is no ambient background matter 
present: we allow positive and negative charges to exist freely in a vacuum, but 
there is no overall neutral, polarizable matter present. The case of polarization 
in the ambient medium is dealt with later. 


21.1.2 Field equations and continuity 


The variation of the action leads to 


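$$ \begin{aligned} \delta S &= \int (dx)\left\{ (\partial^\mu\delta A^\nu)\,\mu_0^{-1}F_{\mu\nu} - J^\mu\delta A_\mu \right\} \\ &= \int (dx)\left\{ \delta A^\nu(-\partial^\mu\mu_0^{-1}F_{\mu\nu}) - J^\mu\delta A_\mu \right\} + \int d\sigma^\mu\left\{ \delta A^\nu\,\mu_0^{-1}F_{\mu\nu} \right\}. \end{aligned} \tag{21.8} $$

Thus, the field equations $\delta S = 0$ are given by
$$ \partial^\mu F_{\mu\nu} = -\mu_0 J_\nu, \tag{21.9} $$

and the continuity condition tells us that the conjugate momentum ($d\sigma^\mu = d\sigma^0$)
is

$$ \Pi_i = \mu_0^{-1}F_{0i}. \tag{21.10} $$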
If the surface $\sigma$ is taken to separate two regions of space rather than time, one
has the standard continuity conditions for the electromagnetic field in a vacuum:

$$ \Delta F_{i0} = 0 $$
$$ \Delta F_{ij} = 0, \tag{21.11} $$


and we have assumed that $\delta A_\mu$ is a continuous function. The momentum
conjugate to the field $A_\mu$ is


$$ \Pi_\mu = \frac{\delta\mathcal{L}}{\delta(\partial^\sigma A^\mu)} = \mu_0^{-1}F_{\sigma\mu}, \tag{21.12} $$
where $\sigma$ points outward from a spacelike hyper-surface. The canonical choice
for this momentum is $\sigma = 0$, where one has

$$ \Pi_\mu = \frac{\delta\mathcal{L}}{\delta(\partial^0 A^\mu)} = \mu_0^{-1}F_{0\mu}, \tag{21.13} $$
which means that $\mu$ can only take values $i = 1, \ldots, n$ in $n$ spatial dimensions,
owing to the anti-symmetry of $F_{\mu\nu}$.
The velocity analogous to $\dot q$ is given by the derivative of the field $\partial_\sigma A^\mu =
\partial_0 A^\mu$. Thus, the canonical definition of the Hamiltonian is


1 
H = Fo,,(d9A") — 8u g E” Fov (21.14) 
0 


However, this expression is not gauge-invariant, whereas the Hamiltonian must 
be. The problem lies in the naive interpretation of the Legendre transform. The 
problem may be cured by defining the Hamiltonian in terms of the variation of 
the action: 


H=-—. (21.15) 


This is a special case (the zero-zero component $\theta_{00}$) of the energy-momentum
tensor, which is discussed below in more general terms. The result for the
Hamiltonian density is

$$ \mathcal{H} = \frac{1}{2}\left( \epsilon_0 E_iE_i + \mu_0^{-1}B_iB_i \right), \tag{21.16} $$


where i = 1,..., n. 
21.1.3 The Jacobi-Bianchi identity


The Bianchi identity in n + 1 dimensions provides two of Maxwell’s equations. 
The equations implied by this identity are different in each new number of 
dimensions. In 3 + 1 dimensions, we have 


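$$ \epsilon^{\mu\nu\lambda\rho}\partial_\nu F_{\lambda\rho} = 0. \tag{21.17} $$
Separating out the space and time components of $\mu$, we obtain, for $\mu = 0$,


$$ \epsilon^{ijk}\partial_i F_{jk} = 2\,\partial^i B_i = 2\,{\rm div}\,\mathbf{B} = 0. \tag{21.18} $$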


For u = i, i.e. the spatial components, we have 
eik oF jk + ey Foj + M9 Fio = 0, (21.19) 
which may be re-written as 
1 9 
2-9, B; — e 3p Fo; + e43; Fro = 0. (21.20) 
F . i 
Thus, using the definition of the electric field in eqn. (21.4), together with the
anti-symmetry of $F_{\mu\nu}$, we obtain

$$ ({\rm curl}\,\mathbf{E})_i = -\frac{\partial B_i}{\partial t}, \tag{21.21} $$


which completes the proof. 
In 2 + 1 dimensions, eqn. (21.18) is absent, since the Bianchi identity now 
has the form 


$$ \epsilon^{\mu\nu\lambda}\partial_\mu F_{\nu\lambda} = 0. \tag{21.22} $$
The full expansion of this equation is 
CF Oo Fig + Ol ap Fo; + e703; Fro = 0, (21.23) 
which can be written as 
e} 3j Er = a (21.24) 


Note that, in 2 + 1 dimensions, the B field is a pseudo-scalar. 


21.1.4 Formal solution by Green functions 


The formal solution to the equations of motion is most conveniently expressed 
in terms of the vector potential. Re-writing the field equation (21.9) in terms of 
the vector potential, we have 


$$ -\Box A_\nu + \partial_\nu(\partial^\mu A_\mu) = \mu_0 J_\nu \tag{21.25} $$
or

$$ (-\Box\,\delta^\mu{}_\nu + \partial^\mu\partial_\nu)A_\mu = \mu_0 J_\nu. \tag{21.26} $$


The formal solution therefore requires the inverse of the operator on the left
hand side of this equation. This presents a problem, though: the determinant of
this matrix-valued operator vanishes! This is easily seen by separating space
and time components as a 2 x 2 matrix,


$$ \begin{vmatrix} -\Box + \partial_0\partial^0 & \partial_0\partial^i \\ \partial_i\partial^0 & -\Box + \partial_i\partial^i \end{vmatrix} = 0. \tag{21.27} $$


The problem here is related to the gauge symmetry, or non-uniqueness, of $A_\mu$,
and can be fixed by choosing a gauge for the potential. The choice of gauge is 
arbitrary, but two conditions are required in general to fix the gauge freedom 
fully (see chapter 9), and ensure a one-to-one correspondence between the 
potentials and the physical fields. 
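
The vanishing determinant can be illustrated in momentum space. This is a rough sketch under stated assumptions: the substitution $\partial_\mu \to ik_\mu$, the metric signature $(-,+,+,+)$, 3 + 1 dimensions, and an arbitrary sample wavevector. The names `maxwell_operator` and `GAUGE_MODE` are hypothetical.

```python
G_UP = (-1.0, 1.0, 1.0, 1.0)  # diagonal of g^{mu nu}, signature (-, +, +, +)

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def maxwell_operator(k_down):
    """Momentum-space form of (-Box delta^mu_nu + d^mu d_nu): with d_mu -> i k_mu
    this is k^2 delta^mu_nu - k^mu k_nu, stored as op[mu][nu]."""
    k_up = [G_UP[mu] * k_down[mu] for mu in range(4)]
    k2 = sum(k_up[mu] * k_down[mu] for mu in range(4))
    return [[k2 * (mu == nu) - k_up[mu] * k_down[nu] for nu in range(4)]
            for mu in range(4)]

K = [2.0, 1.0, -3.0, 0.5]   # sample k_mu, chosen arbitrarily
OP = maxwell_operator(K)

# A pure-gauge potential A_mu proportional to k_mu is annihilated by the operator,
# which is why the operator has no inverse before a gauge is fixed.
GAUGE_MODE = [sum(OP[mu][nu] * K[mu] for mu in range(4)) for nu in range(4)]
print(det(OP), GAUGE_MODE)
```

The zero mode is exactly the gauge direction $A_\mu \propto k_\mu$, so removing it by a gauge condition is what makes the inversion possible.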


21.1.5 Lorentz gauge 


To solve the inverse problem in the ‘Lorentz gauge’, it is sufficient to take
$$ \partial^\mu A_\mu = 0. \tag{21.28} $$


This is only a single condition, so it does not fix the gauge completely, but it is
sufficient for our purposes. Using this condition directly in eqn. (21.26) we get
the modified field equation,


$$ -\Box A_\nu = \mu_0 J_\nu. \tag{21.29} $$


This equation now presents the appearance of a massless Klein-Gordon field.
The formal inverse of the differential operator is therefore the scalar Green
function $G(x, x')$, for $m = 0$, giving the solution


$$ A_\mu(x) = \mu_0\int (dx')\, G(x, x')\, J_\mu(x'). \tag{21.30} $$


Another way of imposing this condition, which is frequently used in the 
literature, is to add a Lagrange multiplier term to the action: 


$$ S = \int (dx)\left\{ \frac{1}{4\mu_0}F^{\mu\nu}F_{\mu\nu} - J^\mu A_\mu + \frac{1}{2}\alpha^{-1}\mu_0^{-1}(\partial^\mu A_\mu)^2 \right\}, \tag{21.31} $$


where $\alpha^{-1}$ is the Lagrange multiplier. The field equations and continuity
conditions resulting from the variation of the action are now


$$ \left[ -\Box\,\delta^\mu{}_\nu + \left( 1 - \frac{1}{\alpha} \right)\partial^\mu\partial_\nu \right] A_\mu = \mu_0 J_\nu, \tag{21.32} $$
and
A(Fy, + 0,Ay) = 0. (21.33) 


The inverse of the differential operator in eqn. (21.32) is found by solving the 
equation 


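$$ \left[ -\Box\,\delta^\mu{}_\nu + \left( 1 - \frac{1}{\alpha} \right)\partial^\mu\partial_\nu \right] D_\mu{}^\rho(x, x') = \delta_\nu{}^\rho\,\delta(x, x'), \tag{21.34} $$

which gives a formal solution for the potential


$$ A_\mu(x) = \mu_0\int (dx')\, D_\mu{}^\nu(x, x')\, J_\nu(x'). \tag{21.35} $$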
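
Once the gauge-fixing term is present, the operator can be inverted. The sketch below checks this in momentum space for a candidate inverse of the familiar form $\left[ \delta + (\alpha - 1)kk/k^2 \right]/k^2$. It assumes $\partial_\mu \to ik_\mu$, the signature $(-,+,+,+)$, and an arbitrary off-shell wavevector; it is not taken from the text, and the function names are hypothetical.

```python
G_UP = (-1.0, 1.0, 1.0, 1.0)  # diagonal of g^{mu nu}, signature (-, +, +, +)

def k_squared(k_down):
    return sum(G_UP[mu] * k_down[mu] ** 2 for mu in range(4))

def gauge_fixed_operator(k_down, alpha):
    """k^2 delta^mu_nu - (1 - 1/alpha) k^mu k_nu: the operator of eqn. (21.32)
    with d_mu -> i k_mu, stored as op[mu][nu]."""
    k2 = k_squared(k_down)
    return [[k2 * (mu == nu) - (1.0 - 1.0 / alpha) * G_UP[mu] * k_down[mu] * k_down[nu]
             for nu in range(4)] for mu in range(4)]

def propagator(k_down, alpha):
    """Candidate inverse: (delta^nu_rho + (alpha - 1) k^nu k_rho / k^2) / k^2."""
    k2 = k_squared(k_down)
    return [[((nu == rho) + (alpha - 1.0) * G_UP[nu] * k_down[nu] * k_down[rho] / k2) / k2
             for rho in range(4)] for nu in range(4)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def identity_error(k_down, alpha):
    """Largest deviation of operator * propagator from the identity matrix."""
    product = matmul(gauge_fixed_operator(k_down, alpha), propagator(k_down, alpha))
    return max(abs(product[i][j] - (i == j)) for i in range(4) for j in range(4))

K = [2.0, 1.0, -3.0, 0.5]   # sample off-shell k_mu (k^2 != 0)
print(identity_error(K, 1.0), identity_error(K, 3.5))
```

For $\alpha = 1$ the second term drops out and the inverse is just the scalar Green function times the unit tensor.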


21.1.6 Coulomb/radiation gauge 


The Coulomb gauge is based on the condition 
$$ \partial^i A_i = 0. \tag{21.36} $$


Again, this is only a single condition, and it is usually supplemented by the
condition $A_0 = 0$, or by the use of the zeroth-component field equation to
eliminate $A_0$ entirely. Using eqn. (21.36) in eqn. (21.26), we separate the space
and time parts of the field equations (a step backwards from covariance):


$$ -\Box A_0 + \partial_0(\partial^0 A_0) = \mu_0 J_0 \tag{21.37} $$
$$ -(\partial_0\partial^0 + \nabla^2)A_i + \partial_i(\partial^0 A_0) = \mu_0 J_i, \tag{21.38} $$
or, simplifying,
$$ -\nabla^2 A_0 = \mu_0 J_0 \tag{21.39} $$
$$ -\Box A_i + \partial_i(\partial^0 A_0) = \mu_0 J_i. \tag{21.40} $$


At this point, it is usual to use the first of these equations to eliminate $A_0$ from
the second, thereby fixing the gauge completely. Formally, we may write


$$ -\Box A_i + \partial_i\partial^0\left( \frac{\mu_0 J_0}{-\nabla^2} \right) = \mu_0 J_i, \tag{21.41} $$


where $(-\nabla^2)^{-1}$ really implies the inverse (or Green function) for the differential
operator $-\nabla^2$, which we denote $g(x, x')$ and which satisfies the equation


$$ -\nabla^2 g(x, x') = \delta(x, x'). \tag{21.42} $$


Thus, eqn. (21.41) is given (still formally, but more explicitly) by


$$ -\Box A_i + \mu_0\,\partial_i\partial^0\int d\sigma_{x'}\, g(x, x')\, J_0(x') = \mu_0 J_i. \tag{21.43} $$
21.1.7 Retarded Green function in n = 3
In the Lorentz gauge, with $\alpha = 1$, we have
$$ D_\mu{}^\nu(x, x') = \delta_\mu{}^\nu\, G(x, x'). \tag{21.44} $$
From Cauchy’s residue theorem in eqn. (5.76), we have


, ACE dk elk Ax- or At) eik Ax+okAt) 
G,(x, x’) = —2m1i(fic) 


(27)* 20 20 
(21.45) 
Using the derivation in section 5.4.1, this evaluates to 
1 
G,(x, x’) = ——-—— (ct — AX 
Aae ) 
: 8 (c(t! — tret)) (21.46) 
=F AE C — |, š . 
Anh*c|x — x'| i 
where the retarded time is defined by
$$ t_{\rm ret} = t - \frac{|\mathbf{x} - \mathbf{x}'|}{c}. \tag{21.47} $$


Note that the retarded time is a function of the position. 
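
As a small illustration of this position dependence, the following sketch evaluates the retarded time for two observation points at the same observation time. The function name `retarded_time` and the sample positions are hypothetical, and SI units are assumed.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def retarded_time(t, x, x_source, c=C):
    """Emission time t_ret = t - |x - x'| / c for a signal arriving at (t, x)."""
    return t - math.dist(x, x_source) / c

# The same observation time corresponds to different retarded times at different points.
NEAR = retarded_time(1.0, (3.0, 0.0, 0.0), (0.0, 0.0, 0.0))
FAR = retarded_time(1.0, (C, 0.0, 0.0), (0.0, 0.0, 0.0))
print(NEAR, FAR)
```

The more distant observer sees the source as it was at an earlier time.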


21.1.8 The energy-momentum tensor 


The gauge-invariant definition of the energy-momentum tensor is (see section 
11.5), 


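$$ \begin{aligned} \theta_{\mu\nu} &= \frac{2}{\sqrt{g}}\frac{\delta S}{\delta g^{\mu\nu}} \\ &= 2\frac{\partial\mathcal{L}}{\partial g^{\mu\nu}} - \mathcal{L}\,g_{\mu\nu} \\ &= \mu_0^{-1}F_{\mu\alpha}F_\nu{}^\alpha - \frac{1}{4\mu_0}F^{\rho\sigma}F_{\rho\sigma}\,g_{\mu\nu}. \end{aligned} \tag{21.48} $$


This result is manifestly gauge-invariant and can be checked against the traditional
expressions obtained from Maxwell’s equations for the energy density and
the momentum flux.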

The zero—zero component, in 3 + 1 dimensions, evaluates to: 


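$$ \begin{aligned} \theta_{00} &= \mu_0^{-1}F_{0i}F_0{}^i - \mathcal{L}\,g_{00} \\ &= \frac{E_iE_i}{\mu_0c^2} + \frac{1}{2\mu_0}\left( B_iB_i - \frac{E_iE_i}{c^2} \right) \\ &= \frac{1}{2\mu_0}\left( \frac{E^2}{c^2} + B^2 \right) \\ &= \frac{1}{2}\left( \epsilon_0\,\mathbf{E}\cdot\mathbf{E} + \mu_0^{-1}\,\mathbf{B}\cdot\mathbf{B} \right), \end{aligned} \tag{21.49} $$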
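which has the interpretation as an energy or Hamiltonian density. The spacetime
off-diagonal components are given by

$$ \theta_{0i} = \theta_{i0} = \mu_0^{-1}F_{0j}F_i{}^j = -\mu_0^{-1}\epsilon_{ijk}E_jB_k/c = -\frac{(\mathbf{E}\times\mathbf{H})_i}{c}, \tag{21.50} $$

which have the interpretation of a momentum density for the field. This vector
is also known as Poynting’s vector. The off-diagonal space parts are

$$ \mu_0\theta_{ij} = F_{i\mu}F_j{}^\mu = F_{i0}F_j{}^0 + F_{ik}F_j{}^k = -\frac{E_iE_j}{c^2} - B_iB_j, \tag{21.51} $$

with $i \neq j$. The diagonal terms with $i$ not summed are

$$ \mu_0\theta_{ii} = F_{i0}F_i{}^0 + F_{ij}F_i{}^j - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} = -\frac{E_i^2}{c^2} - B_i^2 + \frac{1}{2}\left( \frac{E^2}{c^2} + B^2 \right). \tag{21.52} $$

The invariant trace of this tensor in n + 1 dimensions is

$$ \theta^\mu{}_\mu = \mu_0^{-1}F_{\mu\alpha}F^{\mu\alpha} - \frac{n+1}{4\mu_0}F^{\mu\nu}F_{\mu\nu}, \tag{21.53} $$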
which vanishes when n = 3, indicating that Maxwell’s theory is conformally 
invariant in 3 + 1 dimensions. 
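
These statements can be checked numerically for a sample field configuration. The sketch below is illustrative and rests on assumptions: the signature $(-,+,+,+)$, the component definitions $F_{i0} = E_i/c$ and $F_{ij} = \epsilon_{ijk}B_k$, the covariant form of the energy-momentum tensor $\theta_{\mu\nu} = \mu_0^{-1}(F_{\mu\alpha}F_\nu{}^\alpha - \frac{1}{4}g_{\mu\nu}F^{\alpha\beta}F_{\alpha\beta})$, and arbitrary (non-SI) values for $\mu_0$ and $c$. The function names are hypothetical.

```python
MU0, C = 0.5, 2.0                 # arbitrary illustrative units, not SI values
EPS0 = 1.0 / (MU0 * C**2)
G = (-1.0, 1.0, 1.0, 1.0)         # diagonal metric; equal to its own inverse

def levi_civita(i, j, k):
    """Totally anti-symmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(E, B):
    """F_{mu nu} with F_{i0} = E_i / c and F_{ij} = eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[i + 1][0] = E[i] / C
        F[0][i + 1] = -E[i] / C
        for j in range(3):
            F[i + 1][j + 1] = sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F

def energy_momentum(F):
    """theta_{mu nu} = (F_{mu a} F_nu^a - g_{mu nu} F^{ab} F_{ab} / 4) / mu_0."""
    invariant = sum(G[a] * G[b] * F[a][b] ** 2 for a in range(4) for b in range(4))
    return [[(sum(F[m][a] * F[n][a] * G[a] for a in range(4))
              - (G[m] if m == n else 0.0) * invariant / 4.0) / MU0
             for n in range(4)] for m in range(4)]

E_FIELD, B_FIELD = (1.0, -2.0, 0.5), (0.3, 0.7, -1.1)
THETA = energy_momentum(field_strength(E_FIELD, B_FIELD))
ENERGY = 0.5 * (EPS0 * sum(e * e for e in E_FIELD)
                + sum(b * b for b in B_FIELD) / MU0)
TRACE = sum(G[m] * THETA[m][m] for m in range(4))
print(THETA[0][0], ENERGY, TRACE)
```

The zero-zero component reproduces the familiar energy density, and the trace vanishes for any choice of E and B in 3 + 1 dimensions.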


21.2 Effective theory of dielectric and magnetic media 


This section contains a brief summarial discussion of the effective fields for the 
radiation field in the presence of a passive medium. The dielectric approach to 
electromagnetism in near-neutral media is often used since it offers an enormous 
simplification of very many systems. Its main weaknesses are that it makes 
two assumptions: namely, that the response of background matter is dipole- 
like and linear in the applied fields, and that the background matter is smoothly 
homogeneous throughout a given region. The first of these assumptions breaks 
down for strong fields, and the latter breaks down on very small length scales; 
thus, the theory provided in this section must be treated as a long-wavelength 
approximation to electromagnetism for weak fields. 

Maxwell’s equations in a dielectric/magnetic medium are most conveniently 
written in terms of the dielectric displacement vector D, defined by any one of 
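the equivalent relations


$$ \mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} \tag{21.54} $$
$$ \phantom{\mathbf{D}} = \epsilon_0(1 + \chi_e)\mathbf{E} \tag{21.55} $$
$$ \phantom{\mathbf{D}} = \epsilon_0\epsilon_r\mathbf{E}, \tag{21.56} $$


where P is the dielectric polarization and $\chi_e$ is the electric susceptibility,
described below. The magnetic field intensity H is defined by the equivalent forms


$$ \mathbf{H} = \frac{1}{\mu_0}\mathbf{B} - \mathbf{M} \tag{21.57} $$
$$ \phantom{\mathbf{H}} = \frac{\mathbf{B}}{\mu_0(1 + \chi_m)} \tag{21.58} $$
$$ \phantom{\mathbf{H}} = \frac{\mathbf{B}}{\mu_0\mu_r}, \tag{21.59} $$


where M is called the magnetization and $\chi_m$ is the magnetic susceptibility, also defined
below. In terms of these quantities, Maxwell’s equations take on the form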


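$$ \begin{aligned} \nabla\cdot\mathbf{D} &= \rho_e \\ \nabla\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t} \\ \nabla\cdot\mathbf{B} &= 0 \\ \nabla\times\mathbf{H} &= \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}. \end{aligned} \tag{21.60} $$


This form of Maxwell’s equations is valid inside any linear medium. As a further
point, the energy density of the electromagnetic field is given by


$$ \mathcal{E} = \frac{1}{2}\left( \mathbf{E}\cdot\mathbf{D} + \mathbf{B}\cdot\mathbf{H} \right). \tag{21.61} $$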


This must agree with the Hamiltonian density. 
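
The equivalence of the different forms of D and H for a linear medium can be made concrete with a short numerical sketch. It assumes the standard linear-response relations $\mathbf{P} = \epsilon_0\chi_e\mathbf{E}$ and $\mathbf{M} = \chi_m\mathbf{H}$, SI values for $\epsilon_0$ and $\mu_0$, and arbitrary sample fields and susceptibilities; the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 1.25663706212e-6    # vacuum permeability, H/m

def displacement(E, chi_e):
    """Return (D, P) for a linear dielectric with P = eps0 * chi_e * E."""
    P = [EPS0 * chi_e * e for e in E]
    D = [EPS0 * e + p for e, p in zip(E, P)]
    return D, P

def field_intensity(B, chi_m):
    """Return (H, M) for a linear magnetic medium with M = chi_m * H."""
    H = [b / (MU0 * (1.0 + chi_m)) for b in B]
    M = [chi_m * h for h in H]
    return H, M

def energy_density(E, D, B, H):
    """(E.D + B.H) / 2, as in eqn. (21.61)."""
    return 0.5 * (sum(e * d for e, d in zip(E, D)) + sum(b * h for b, h in zip(B, H)))

def refractive_index(chi_e, chi_m):
    """n = sqrt(eps_r * mu_r) with eps_r = 1 + chi_e and mu_r = 1 + chi_m."""
    return math.sqrt((1.0 + chi_e) * (1.0 + chi_m))

E_FIELD, B_FIELD = (100.0, 0.0, -50.0), (0.0, 2e-3, 1e-3)   # sample values
CHI_E, CHI_M = 3.0, 0.25
D_FIELD, P_FIELD = displacement(E_FIELD, CHI_E)
H_FIELD, M_FIELD = field_intensity(B_FIELD, CHI_M)
print(energy_density(E_FIELD, D_FIELD, B_FIELD, H_FIELD), refractive_index(CHI_E, CHI_M))
```

The refractive index computed here is the quantity n that reappears when covariance is reinstated with the replacement of c by c/n.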


21.2.1 The Maxwell action and Hamiltonian in a medium 


To express Maxwell’s equations in covariant form, we had to introduce the fields 
D and H. We should therefore expect that, in the covariant description, we need 
to introduce a new tensor. We shall call this tensor $G_{\mu\nu}$, and define it by


$$ G_{\mu\nu} = \begin{pmatrix} 0 & -cD_1 & -cD_2 & -cD_3 \\ cD_1 & 0 & H_3 & -H_2 \\ cD_2 & -H_3 & 0 & H_1 \\ cD_3 & H_2 & -H_1 & 0 \end{pmatrix}. \tag{21.62} $$

We see that this tensor has the same structure as $F_{\mu\nu}$, but with $c\mathbf{D}$ replacing $\mathbf{E}/c$
and H replacing B.


In terms of this tensor, we can write the action in the form 


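$$ S = \int (dx)\left\{ \frac{1}{4}F^{\mu\nu}G_{\mu\nu} - J^\mu A_\mu \right\}. \tag{21.63} $$
The canonical momentum can be written
$$ \Pi_\mu = \frac{\delta\mathcal{L}}{\delta(\partial^0 A^\mu)} = G_{0\mu}. \tag{21.64} $$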


Again, one has the same problem of gauge invariance with the generalized 
‘velocity’ as in the vacuum case. The Hamiltonian is best computed from the 
energy-momentum tensor. It is given by 


$$ \mathcal{H} = \frac{1}{2}\left( B_iH_i + E_iD_i \right) + J^\mu A_\mu. \tag{21.65} $$


21.2.2 Field equations and continuity 
Using the property of the trace that 


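$$ \delta F^{\mu\nu}G_{\mu\nu} = \delta G^{\mu\nu}F_{\mu\nu}, \tag{21.66} $$


for linear equations of motion (i.e. $G_{\mu\nu}$ does not depend on $F_{\mu\nu}$), we find the
variation of the action is given by


$$ \delta S = \int (dx)\left\{ -\delta A^\nu\,\partial^\mu G_{\mu\nu} - J^\mu\delta A_\mu \right\} + \int d\sigma^\mu\left\{ \delta A^\nu G_{\mu\nu} \right\} = 0. \tag{21.67} $$


We see immediately that the field equations are given by
$$ \partial^\mu G_{\mu\nu} = -J_\nu, \tag{21.68} $$


which may be compared with eqn. (21.9), and that the continuity condition
implies that the canonical momentum is ($\mu = 0$)


$$ \Pi_\sigma = G_{0\sigma}, \tag{21.69} $$


and that the condition for continuity across a surface dividing two regions of
space ($\mu = i$) divides into two cases,

$$ \Delta G_{i0} = \Delta D = 0 $$

$$ \Delta G_{ij} = \Delta H = 0. \tag{21.70} $$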


These are the well known continuity conditions for the field at a dielectric 
boundary. 
In terms of the vector potential, we can write the field equations
c'e [33A ) + VA} + JP =0 
ce {90°A‘ — d! (8V Ao) } + EN —a'(a;A/)) + J! =0. 
(21.71) 
This ugly mess can be compared with eqn. (21.26) for the vacuum case. At first 


sight, it appears that covariance is irretrievably lost in these expressions, but this 
is only an illusion caused by the spurious factors of $\epsilon$ and $\mu$, as explained below.


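21.2.3 Reinstating covariance with c → c/n


To reinstate covariance, we note that the introduction of a modified gauge
condition helps to unravel the equations:


$$ c^2\epsilon\mu\,(\partial^0 A_0) + \partial_i A^i = 0. \tag{21.72} $$
This can also be written as
$$ n^2(\partial^0 A_0) + \partial_i A^i = 0, \tag{21.73} $$
where $n^2 = c^2\epsilon\mu$ is the square of the refractive index. This suggests that we re-define the derivative as
$$ \hat\partial_\mu = \left( \frac{n}{c}\partial_t, \nabla \right). \tag{21.74} $$


In terms of this derivative, the field equations now combine to give


$$ -\hat\Box\, A^\mu = \mu J^\mu. \tag{21.75} $$


To re-write the gauge condition in terms of this new derivative, we must also
define $\hat A_\mu$ and $\hat J_\mu$, replacing c by c/n in each case:


$$ \hat A^\mu = \begin{pmatrix} n\phi/c \\ \mathbf{A} \end{pmatrix}, \qquad \hat J^\mu = \begin{pmatrix} c\rho/n \\ \mathbf{J} \end{pmatrix}. \tag{21.76} $$


Then we have the complete, covariant (n + 1)-vector form of the field equations
in a medium:


$$ \hat\partial^\mu\hat A_\mu = 0. \tag{21.77} $$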
21.2.4 Green function


The Green function is easily obtained by direct analogy to the vacuum case. 
The most elegant form is obtained using the careted notation. The photon Green 
function in a dielectric satisfies the equation 


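$$ \left[ -\hat\Box\,\delta^\mu{}_\nu + \left( 1 - \frac{1}{\alpha} \right)\hat\partial^\mu\hat\partial_\nu \right] D_\mu{}^\rho(x, x') = \delta_\nu{}^\rho\,\delta(x, x'). \tag{21.78} $$
Thus, defining the careted momentum by
$$ \hat\partial_\mu\, e^{ikx} = i\hat k_\mu\, e^{ikx}, \tag{21.79} $$


i.e. such that $\hat k_\mu = (-n\omega/c, \mathbf{k})$, one has straightforwardly that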


X ; Ge lei kk, 
D(x, x") = Gayo? | re a | (21.80) 


Note carefully which of the quantities are careted and which are not. This Green 
function relates the careted field to the careted source, 


A, (x) = [eee Byer a. (21.81) 


22 


The massive Proca field 


The massive vector field is the model which describes massive vector bosons, 
such as the W and Z particles of the electro-weak theory. 


22.1 Action and field equations 


The action for the Proca field is 
$$ S = \int (dx)\left\{ \frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{1}{2}m^2A^\mu A_\mu - J^\mu A_\mu \right\}, \tag{22.1} $$
where
$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{22.2} $$
The variation of the action gives
$$ \delta S = \int (dx)\left[ -\partial^\nu F_{\nu\mu} + m^2A_\mu - J_\mu \right]\delta A^\mu + \int d\sigma^\nu F_{\nu\mu}\,\delta A^\mu. \tag{22.3} $$
This yields the field equation
$$ -\partial^\nu F_{\nu\mu} + m^2A_\mu = J_\mu, \tag{22.4} $$


also writable as


$$ -\Box A_\mu + \partial_\mu(\partial^\nu A_\nu) + m^2A_\mu = J_\mu, \tag{22.5} $$


and associated continuity conditions identical to those of the Maxwell field. The
conjugate momentum ($d\sigma^\mu = d\sigma^0$) is


$$ \Pi_i = F_{0i}. \tag{22.6} $$
If the surface $\sigma$ is taken to separate two regions of space rather than time, one
has the continuity conditions for the Proca field in a vacuum:


$$ \Delta F_{i0} = 0 $$
$$ \Delta F_{ij} = 0, \tag{22.7} $$


and we have assumed that $\delta A_\mu$ is a continuous function. Taking the n + 1
divergence of eqn. (22.4), we obtain


$$ \partial^\mu A_\mu = 0. \tag{22.8} $$


Here we have used the anti-symmetry of $F_{\mu\nu}$ and the assumption that the source
is conserved, $\partial^\mu J_\mu = 0$. Thus the field equations, in the form of eqn. (22.5), become


$$ (-\Box + m^2)A_\mu = J_\mu \tag{22.9} $$
$$ \partial^\mu A_\mu = 0. \tag{22.10} $$


In contrast to the electromagnetic field, this has both transverse and longitudinal 
components. 
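
The role of the mass term can be seen in momentum space: unlike the Maxwell operator, the Proca operator is invertible without gauge fixing. The following is a sketch under assumptions (the substitution $\partial_\mu \to ik_\mu$ in eqn. (22.5), signature $(-,+,+,+)$, 3 + 1 dimensions, an arbitrary sample wavevector); the function names are hypothetical.

```python
G_UP = (-1.0, 1.0, 1.0, 1.0)  # diagonal of g^{mu nu}, signature (-, +, +, +)

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def proca_operator(k_down, mass):
    """(k^2 + m^2) delta_mu^nu - k_mu k^nu: eqn. (22.5) with d_mu -> i k_mu."""
    k_up = [G_UP[mu] * k_down[mu] for mu in range(4)]
    k2 = sum(k_up[mu] * k_down[mu] for mu in range(4))
    return [[(k2 + mass**2) * (mu == nu) - k_down[mu] * k_up[nu] for nu in range(4)]
            for mu in range(4)]

K = [2.0, 1.0, -3.0, 0.5]                              # sample k_mu
K2 = sum(G_UP[mu] * K[mu] ** 2 for mu in range(4))     # k^mu k_mu
MASS = 1.5

# One longitudinal eigenvalue m^2 and three transverse eigenvalues k^2 + m^2.
EXPECTED = MASS**2 * (K2 + MASS**2) ** 3
print(det(proca_operator(K, MASS)), EXPECTED, det(proca_operator(K, 0.0)))
```

The determinant factorizes into one longitudinal eigenvalue $m^2$ and three eigenvalues $k^2 + m^2$, so it vanishes only in the massless (Maxwell) limit, where the longitudinal mode becomes pure gauge.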


23 
Non-Abelian fields 


Positive and negative electrical charges label the different kinds of matter that 
respond to the electromagnetic field; there is the gravitational mass, for example, 
which only seems to have positive sign and labels matter which responds to the 
gravitational field. More kinds of charge are required to label particles which 
respond to the nuclear forces. With more kinds of charge, there are many 
more possibilities for conservation than simply that the sum of all positive and 
negative charges is constant. Non-Abelian gauge theories are physical models 
analogous to electromagnetism, but with more general ideas of charge. Some 
have three kinds of charge: red, green and blue (named whimsically after the 
primary colours); other theories have more kinds with very complicated rules 
about how the different charges are conserved. This chapter is about such 
theories. 


23.1 Lie groups and algebras 


In chapter 9 it was noted that the gauge invariance of matter and electromagnetic 
radiation could be thought of as a symmetry group called U(1): the group of 
phase transformations on matter fields: 


$$ \Phi \to e^{i\theta(x)}\,\Phi, \tag{23.1} $$


for some scalar function $\theta(x)$. Since phase factors of this type commute with
one another,


$$ \left[ e^{i\theta(x)}, e^{i\theta'(x)} \right] = 0, \tag{23.2} $$


such a symmetry group is called commutative or Abelian. The symmetry group
was identified from the anti-symmetry properties of the curls in Maxwell’s
equations, but the full beauty of the symmetry only became apparent in the 
covariant formulation of field theory. 
In the study of angular momentum in chapter 9, it was noted that the
symmetry group of rotations in two spatial dimensions was U(1), but that in 
three spatial dimensions it was O(3). The latter is a non-Abelian group, i.e. its 
generators do not commute; instead they have a commutator which satisfies a 
relation called a Lie algebra. 

Non-Abelian gauge theories have, for the most part, been the domain of 
particle physicists trying to explain the elementary nature of the nuclear forces 
in collision experiments. In recent times, some non-Abelian field theories have 
also been used by condensed matter physicists. In the latter case, it is not 
fundamental fields which satisfy the exotic symmetry properties, but composite 
excitations in matter referred to as quasi-particles. 

The motivation for non-Abelian field theory is the existence of families of 
excitations which are related to one another by the fact that they share and 
respond to a common form of charge. Each so-called flavour of excitation is 
represented by an individual field which satisfies an equation of motion. The 
fields are grouped together so that they form the components of a column vector,
and matrices which multiply these vectors enact symmetry transformations on
them, precisely analogous to the phase transformations of electromagnetism,
but now with more components. The local form of the symmetry requires the
existence of a non-Abelian gauge field, $A_\mu$, which is matrix-valued.

Thus one asks the question: what happens if fields are grouped into multiplets
(analogous to the components of angular momentum) by postulating hidden
symmetries, based on non-Abelian groups?

This idea was first used by Yang and Mills in 1954 to develop the isospin 
SU (2) model for the nuclear force [141]. The unfolding of the experimental 
evidence surrounding nucleons led to a series of deductions about conservation 
from observed particle lifetimes. Charge labels such as baryon number, isospin 
and strangeness were invented to give a name to these, and the supposition that 
conserved charges are associated with symmetries led to the development of 
non-Abelian symmetry models. For a summary of the particle physics, see, for 
example, refs. [34, 108]. 

Non-Abelian models have been used in condensed matter physics, where 
quasi-fields for mean-field spin systems have been formulated as field theories 
with SU(N) symmetry [1, 54]. 


23.2 Construction 


We can now extend the formalism in the remainder of this book to encompass 
non-Abelian fields. To do this, we have to treat the fields as multi-component 
vectors on the abstract internal space of the symmetry group, since the transfor- 
mations which act on the fields are now matrices. The dimension of the matrices 
which act on matter fields (Klein—Gordon or Dirac) does not have to be the same 
as those which attach to the gauge field $A_\mu$; the only requirement is that both
sets of matrices satisfy the same algebra. This will become clearer when we
examine the nature of gauge transformations for non-Abelian groups.

We begin with some notation. Let $\{T^a\}$, where $a = 1, \ldots, d_G$, denote a set
of matrices which acts as the generators of the simple Lie algebra for the group
G. These matrices satisfy the Lie algebra


$$ [T^a, T^b] = -i f^{abc}\,T^c. \tag{23.3} $$


The $T^a$ are chosen here to be Hermitian. This makes the structure constants
real numbers. It is also possible to find an anti-Hermitian representation by
multiplying all of the $T^a$ by a factor of $i = \sqrt{-1}$, but we shall not use this
convention here. With anti-Hermitian conventions, the Abelian limit leads to
an imaginary electric charge which does not agree with the conventions used in
other chapters.

The $T^a$ are $d_R \times d_R$ matrices. In component form, one may therefore write
them explicitly $(T^a)_{AB}$, where $A, B = 1, \ldots, d_R$, but normally the explicit
components of $T^a$ are suppressed and a matrix multiplication is understood.
We denote the group which is obtained from these by $G_R$, which means the
representation R of the group G. The normalization of the generators is fixed
by defining


$$ {\rm Tr}(T^aT^b) = I_2(G_R)\,\delta^{ab}, \tag{23.4} $$


where $I_2$ is called the Dynkin index for the representation $G_R$. The Dynkin
index may also be written


$$ I_2(G_R) = \frac{d_R}{d_G}\,C_2(G_R), \tag{23.5} $$
where $d_R$ is the dimension (number of rows/columns) of the generators in the
representation $G_R$ and $d_G$ is the dimension of the group. $C_2(G_R)$ is the quadratic
Casimir invariant for the group in the representation $G_R$: $C_2(G_R)$ and $I_2(G_R)$
are constants which are listed in tables for various representations of Lie groups.
$d_G$ is the same as the dimension of the adjoint representation of the algebra
$G_{\rm adj}$, by definition of the adjoint representation. Note therefore that $I_2(G_{\rm adj}) =
C_2(G_{\rm adj})$.

In many texts, authors make the arbitrary choice of replacing the right hand
side of eqn. (23.4) with $\frac{1}{2}\delta^{ab}$. This practice simplifies formulae in a small
number of special cases, but can lead to confusion later. Also, it makes the 
identification of group constants (for arbitrary groups) impossible and leads 
therefore to expressions which are not covariant with respect to changes of 
symmetry group. 
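
As a concrete illustration of the normalization (23.4) and the relation (23.5), the sketch below evaluates the Dynkin index and quadratic Casimir invariant for SU(2) in two representations. The choice of SU(2), the generators $T^a = \sigma^a/2$ for the fundamental representation and $(T^a)_{bc} = -i\epsilon_{abc}$ for the adjoint, and all function names are assumptions made for the example.

```python
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def levi_civita(i, j, k):
    """Totally anti-symmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# SU(2) generators: fundamental (d_R = 2) and adjoint (d_R = d_G = 3).
FUNDAMENTAL = [[[x / 2 for x in row] for row in s] for s in PAULI]
ADJOINT = [[[-1j * levi_civita(a, b, c) for c in range(3)] for b in range(3)]
           for a in range(3)]

def dynkin_index(gens):
    """I_2 read off from Tr(T^a T^a) for the first generator (no sum)."""
    return trace(matmul(gens[0], gens[0])).real

def casimir(gens):
    """C_2 read off from the (0, 0) entry of sum_a T^a T^a = C_2 * identity."""
    return sum(matmul(t, t)[0][0] for t in gens).real

def index_from_casimir(gens):
    """Right hand side of eqn. (23.5): d_R * C_2 / d_G."""
    d_r, d_g = len(gens[0]), len(gens)
    return d_r * casimir(gens) / d_g

for name, gens in (("fundamental", FUNDAMENTAL), ("adjoint", ADJOINT)):
    print(name, dynkin_index(gens), casimir(gens), index_from_casimir(gens))
```

In this example the Dynkin index differs between the two representations, which is the information lost when the right hand side of (23.4) is fixed by hand.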

To construct a physical theory with such an internal symmetry group we 
must look to the behaviour of the fields under a symmetry transformation. We 
require the analogue of a gauge transformation in the Abelian case. We begin
by assuming that the form of a symmetry transformation on matter fields is


$$ \Phi \to U\Phi, \tag{23.6} $$
for some matter field $\Phi$ and some matrix
$$ U = \exp\left( i\theta^a(x)\,T^a \right), \tag{23.7} $$


which is the element of some Lie group, with an algebra generated by $T^a$
($a = 1, \ldots, d_G$). Eqn. (23.6) contains an implicit matrix multiplication: the
components are normally suppressed; if we write them explicitly, eqn. (23.6)
has the appearance:


$$ \Phi^A \to U^A{}_B\,\Phi^B. \tag{23.8} $$


Since the generators do not commute with one another, and since U is a combination
of these generators, $T^a$ and U cannot commute; moreover, consecutive
gauge transformations do not commute,


$$ [U, U'] \neq 0, \tag{23.9} $$


in general. The exception to this statement is if the group element U lies in
the centre of the group (i.e. the group’s Abelian sub-group) which is generated
purely by the Cartan sub-algebra:


$$ U_c = \exp\left( i\theta^i(x)\,H^i \right), \qquad (i = 1, \ldots, {\rm rank}\,G) \tag{23.10} $$
$$ 0 = [U_c, U]. \tag{23.11} $$


Under such a transformation, the spacetime-covariant derivative is not gauge- 
covariant: 


ð (U P) A U (3, ®). (23.12) 


We must therefore follow the analogue of the procedure in chapter 10 to define a 
covariant derivative for the non-Abelian symmetry. We do this in the usual way, 
by introducing a gauge connection, or vector potential 


$$A_\mu = A^a_\mu(x)\,T^a, \tag{23.13}$$


which is a linear combination of all the generators. The basis components $A^a_\mu(x)$
are now the physical fields, which are to be varied in the action. There is
one such field for each generator, i.e. the total number of fields is equal to the
dimension of the group $d_G$. In terms of this new field, we write the covariant
derivative


$$D_\mu = \partial_\mu + i\frac{g}{\hbar}A_\mu, \tag{23.14}$$
where $g$ is a new charge for the non-Abelian symmetry. As in the Abelian case,
$D_\mu$ will only satisfy
$$D_\mu(U\Phi) = U(D_\mu\Phi), \tag{23.15}$$


if $\Phi$ and $A_\mu$ both transform together. We can determine the way in which $A_\mu$
must transform by writing


$$D_\mu(U\Phi) = (\partial_\mu U)\Phi + U(\partial_\mu\Phi) + i\frac{g}{\hbar}A_\mu U\Phi$$


$$\qquad = U\left[\partial_\mu\Phi + U^{-1}(\partial_\mu U)\Phi + i\frac{g}{\hbar}U^{-1}A_\mu U\Phi\right]. \tag{23.16}$$


From this, we deduce that 


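$$i\frac{g}{\hbar}A'_\mu\Phi = i\frac{g}{\hbar}U^{-1}A_\mu U\Phi + U^{-1}(\partial_\mu U)\Phi, \tag{23.17}$$
so that the complete non-Abelian gauge transformation has the form
$$\Phi' = U\Phi$$
$$A'_\mu = U^{-1}A_\mu U - \frac{i\hbar}{g}U^{-1}(\partial_\mu U). \tag{23.18}$$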


The transformation of the field strength tensor in a non-Abelian field theory can 
be derived from its definition: 


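$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + i\frac{g}{\hbar}[A_\mu, A_\nu], \tag{23.19}$$


and has the form
$$F_{\mu\nu} \rightarrow U^{-1}F_{\mu\nu}U. \tag{23.20}$$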


Note that the field strength is not gauge-invariant: it transforms in a non-trivial
way. This means that $F_{\mu\nu}$ is not an observable in non-Abelian field theory. The
field strength tensor can also be expressed directly in terms of the covariant
derivative by the formula


$$[D_\mu, D_\nu] = i\frac{g}{\hbar}F_{\mu\nu}, \tag{23.21}$$


or
$$F_{\mu\nu} = D_\mu A_\nu - D_\nu A_\mu - i\frac{g}{\hbar}[A_\mu, A_\nu]. \tag{23.22}$$
The field strength can also be expressed as a linear combination of the generators
of the Lie algebra, and we define the physical components relative to a given
basis set $T^a$ by


$$F_{\mu\nu} = F^a_{\mu\nu}T^a. \tag{23.23}$$
Using the algebra relation (23.3), these components can be expressed in the form


$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + \frac{g}{\hbar}f^{abc}A^b_\mu A^c_\nu. \tag{23.24}$$
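
The relation between the commutator of covariant derivatives and the field strength can be made explicit with a short worked step. The following is a sketch that assumes only the covariant derivative of eqn. (23.14), $D_\mu = \partial_\mu + i\frac{g}{\hbar}A_\mu$, acting on a matter field $\Phi$; the second derivatives and the terms symmetric in $\mu\nu$ cancel in the antisymmetrization.

```latex
\begin{align*}
[D_\mu, D_\nu]\Phi
 &= \left(\partial_\mu + i\tfrac{g}{\hbar}A_\mu\right)
    \left(\partial_\nu + i\tfrac{g}{\hbar}A_\nu\right)\Phi - (\mu \leftrightarrow \nu) \\
 &= i\tfrac{g}{\hbar}\left(\partial_\mu A_\nu - \partial_\nu A_\mu\right)\Phi
    + \left(i\tfrac{g}{\hbar}\right)^2 [A_\mu, A_\nu]\,\Phi \\
 &= i\tfrac{g}{\hbar}\left(\partial_\mu A_\nu - \partial_\nu A_\mu
    + i\tfrac{g}{\hbar}[A_\mu, A_\nu]\right)\Phi .
\end{align*}
```

The bracket in the last line is the combination that appears in the definition of the field strength, which is why the commutator is proportional to $F_{\mu\nu}$.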
23.3 The action


We are now in a position to postulate a form for the action of a non-Abelian 
gauge theory. We have no way of knowing what the ‘correct’ action for such 
a theory is (nor any way of knowing if such a theory is relevant to nature), so 
we allow ourselves to be guided by the invariant quantities which can be formed 
from the non-Abelian fields. For free scalar matter fields, it is natural to write 


Su = J (dx) {h° (DED) (D,D) +m D h, (23.25) 
where 
$$D_\mu\Phi = \partial_\mu\Phi + igA_\mu\Phi, \tag{23.26}$$


which has the form of a matrix acting on a vector. Clearly, the number of
components in the vector $\Phi$ must be the same as the number of rows and
columns in the matrix $A_\mu$ in order for this to make sense. The dagger symbol
implies complex conjugation and transposition.

For the non-Abelian Yang—Mills field the action analogous to the Maxwell 
action is Syy[A + A], where 


1 
a a L, pv 
Sym[A] = ENA CE [oon(r Fw) , (23.27) 


where $\mu_{NA}$ is analogous to the permeability in electromagnetism. The trace
in eqn. (23.27) refers to the trace over implicit matrix components of the
generators. The cyclic property of the trace ensures that this quantity is
gauge-invariant. Under a gauge transformation, one has


$$\mathrm{Tr}\left(F^{\mu\nu}F_{\mu\nu}\right) \rightarrow \mathrm{Tr}\left(U^{-1}F^{\mu\nu}F_{\mu\nu}U\right) = \mathrm{Tr}\left(F^{\mu\nu}F_{\mu\nu}\right). \tag{23.28}$$
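
The role of the cyclic property of the trace can be illustrated numerically. This is a minimal sketch for SU(2), assuming generators $T^a = \sigma^a/2$ and two arbitrary algebra-valued matrices standing in for components of the field strength; the helper names and the parametrization of the unitary matrix are illustrative assumptions, not taken from the text.

```python
import cmath
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def algebra_element(components):
    """F = F^a T^a with T^a = sigma^a / 2."""
    return [[0.5 * sum(components[a] * SIGMA[a][i][j] for a in range(3))
             for j in range(2)] for i in range(2)]


def su2_matrix(a, b, c):
    """A generic unitary 2x2 matrix with unit determinant."""
    return [[math.cos(a) * cmath.exp(1j * b), math.sin(a) * cmath.exp(1j * c)],
            [-math.sin(a) * cmath.exp(-1j * c), math.cos(a) * cmath.exp(-1j * b)]]


def conjugate_by(u, f):
    """Return U^{-1} F U, using U^{-1} = U^dagger for a unitary matrix."""
    return matmul(dagger(u), matmul(f, u))


F = algebra_element([0.3, -1.2, 0.8])
G = algebra_element([1.0, 0.5, -0.4])
U = su2_matrix(0.9, 0.4, -1.7)

before = trace(matmul(F, G))
after = trace(matmul(conjugate_by(U, F), conjugate_by(U, G)))
print(before, after)
```

The individual matrices change under conjugation, but the trace of their product does not, which is the content of eqn. (23.28).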


23.4 Equations of motion and continuity 


The Wong equations describe classical point particles coupled to a non-Abelian 
gauge field [140]: 


— — p” 
dt P 
d H 
m—— = gQ° F" p 
T 
d 
peed fE prao: (23.29) 
23.5 Multiple representations


The gauge field $A_\mu$ appears several times in the action: both in connection with
the covariant derivative acting on matter fields, and in connection with the
Yang–Mills term. The dimension $d_R$ of the matrix representation used in the different
parts of the action does not have to be equal throughout. Indeed, the number
of components in the matter vector is chosen on ‘phenomenological’ grounds to
match the number of particles known to exist in a multiplet. A common choice
is:


• the fundamental representation for matter fields, i.e. $A_\mu = A^a_\mu T^a_f$ in $D_\mu$;


• the adjoint representation for the Yang–Mills terms, i.e. $A_\mu = A^a_\mu T^a_{\rm adj}$ in
$\mathrm{Tr}\,F^2$.


Although this is a common situation, it is not a necessity. The choice of
representation for the matter fields should be motivated by phenomenology. In 
the classical theory, there seems to be no good reason for choosing the adjoint 
matrices for gauge fields. It is always true that the components of the field 
transform in the adjoint representation regardless of the matrices used to define 
the action. 


23.6 The adjoint representation 


One commonly held belief is that the gauge field, $A_\mu$, must be constructed from
the generators of the adjoint representation. In fact, the components of the gauge
field $A^a_\mu$ transform like a vector in the adjoint representation, regardless of the
matrix representations used to define the gauge fields in the action. This follows
simply from the fact that $A_\mu$ is a linear combination of all the generators of the
algebra [21]. To show this, we begin by noting that the structure constants, which
are identical for any matrix representation, form the components of a matrix
representation for the adjoint representation, by virtue of the Jacobi identity
(see section 8.5.2).

Consider an arbitrary field $\Lambda$ with components $\Lambda^a$ relative to a set of basis
generators $T^a$ in an arbitrary representation, defined by


$$\Lambda = \Lambda^a T^a. \tag{23.30}$$


The generator matrices may be in a representation with arbitrary dimension $d_R$.
Under a gauge transformation, we shall assume that the field transforms like


$$\Lambda' = U\Lambda U^{-1}, \tag{23.31}$$
where $U$ is in the same matrix representation as $T^a$ and may be written


$$U = \exp\left(i\theta^a T^a\right). \tag{23.32}$$
Using the matrix identity,
$$\exp(A)\,B\,\exp(-A) = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \frac{1}{3!}[A, [A, [A, B]]] + \cdots, \tag{23.33}$$
it is straightforward to show that
$$\Lambda' = \Lambda^a\left[\delta^{ar} - \theta_b f^{abr} + \frac{1}{2!}\theta_b\theta_c f^{abs}f^{scr} - \frac{1}{3!}\theta_b\theta_c\theta_d f^{abs}f^{sct}f^{tdr} + \cdots\right]T^r, \tag{23.34}$$


where the algebra commutation relation has been used. In our notation, the
generators of the adjoint representation may be written


$$(T^a_{\rm adj})^b{}_c = i f^{ab}{}_c, \tag{23.35}$$


and the structure constants are real. Eqn. (23.34) may therefore be identified as
$$\Lambda' = \Lambda^a\,(U_{\rm adj})^b{}_a\,T^b, \tag{23.36}$$
where
$$U_{\rm adj} = \exp\left(i\theta^a T^a_{\rm adj}\right). \tag{23.37}$$
If we now define the components of the transformed field by
$$\Lambda' = \Lambda'^a T^a, \tag{23.38}$$
in terms of the original generators, then it follows that
$$\Lambda'^a = (U_{\rm adj})^a{}_b\,\Lambda^b. \tag{23.39}$$


We can now think of the set of components $\Lambda^a$ and $\Lambda'^a$ as being grouped into $d_G$
component column vectors $\boldsymbol{\Lambda}$ and $\boldsymbol{\Lambda}'$, so that


$$\boldsymbol{\Lambda}' = U_{\rm adj}\,\boldsymbol{\Lambda}. \tag{23.40}$$
In matrix notation, the covariant derivative of the matrix-valued field $\Lambda$ is
$$D_\mu\Lambda = \partial_\mu\Lambda + ig[A_\mu, \Lambda], \tag{23.41}$$
for any representation. Using the algebra commutation relation this becomes


$$D_\mu\boldsymbol{\Lambda} = \partial_\mu\boldsymbol{\Lambda} + igA^{\rm adj}_\mu\boldsymbol{\Lambda}, \tag{23.42}$$


where $A^{\rm adj}_\mu = A^a_\mu T^a_{\rm adj}$. We have therefore shown that the vectorial components
of the gauge field transform according to the adjoint representation, regardless
of the matrices which are used in the matrix form.
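
The conclusion of this section can be tested numerically. The sketch below is an illustration for SU(2) only, under stated assumptions: the field is taken in the fundamental representation with $T^a = \sigma^a/2$, the transformation is taken to be conjugation by $U = \exp(i\theta^a T^a)$, and the adjoint matrices are built directly from the commutators of the $T^a$ (so that the check does not depend on a sign convention for the structure constants). All helper names are hypothetical.

```python
# Pauli matrices; the assumed fundamental generators are T^a = sigma^a / 2.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[0.5 * entry for entry in row] for row in s] for s in SIGMA]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def combine(coeffs, mats):
    """Return sum_a coeffs[a] * mats[a]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]


def expm(a, terms=40):
    """Matrix exponential from its power series (adequate for small matrices of modest norm)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def components(m):
    """Components m^a of m = m^a T^a, using Tr(T^a T^b) = delta^{ab} / 2 for these generators."""
    return [2 * sum(matmul(T[a], m)[i][i] for i in range(2)) for a in range(3)]


def adjoint_generators():
    """Matrices on component vectors: (ad^a)_{db} is the coefficient of T^d in [T^a, T^b]."""
    ad = [[[0j] * 3 for _ in range(3)] for _ in range(3)]
    for a in range(3):
        for b in range(3):
            commutator = combine([1, -1], [matmul(T[a], T[b]), matmul(T[b], T[a])])
            for d, coeff in enumerate(components(commutator)):
                ad[a][d][b] = coeff
    return ad


theta = [0.3, -0.8, 0.5]   # arbitrary group parameters
lam = [1.0, 2.0, -0.5]     # arbitrary components of the field

# Transform the matrix-valued field by conjugation in the 2x2 representation ...
U = expm(combine([1j * t for t in theta], T))
U_inv = expm(combine([-1j * t for t in theta], T))
lam_prime = components(matmul(U, matmul(combine(lam, T), U_inv)))

# ... and transform the component vector with the 3x3 adjoint matrix.
U_adj = expm(combine([1j * t for t in theta], adjoint_generators()))
lam_prime_adj = [sum(U_adj[a][b] * lam[b] for b in range(3)) for a in range(3)]

max_diff = max(abs(x - y) for x, y in zip(lam_prime, lam_prime_adj))
print(max_diff)
```

Although the field itself is built from 2×2 matrices, its three components are rotated by a 3×3 adjoint matrix, which is the statement made in the text.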
23.7 Field equations and continuity


= t fat 1l v | 
= fæ [oro (D,®) +m oto + aoo Fuy) 


(23.43) 


The variation of the action with respect to $\Phi^\dagger$ yields the equation of motion
for $\Phi$:


ôS = fæ [(D 8) (D,D) + m5" E] 
= fao {—D°® + m*o} 
+ J do” [8P (D, ®)}. (23.44) 
The gauge-fixing term is 


es dv, Tr( D (D, a"). (23.45) 
E | 


23.8 Commonly used generators 


It is useful to have explicit forms for the generators in the fundamental and 
adjoint representations for the two most commonly discussed groups. For 
SU(N), the matrices of the fundamental representation have dimension N. 


23.8.1 SU (2) Hermitian fundamental representation 


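Here, the generators are simply one-half the Pauli matrices in the usual basis:
$$T^1 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
$$T^2 = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$$
$$T^3 = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{23.46}$$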
$$H = T^3, \qquad E_{\pm\alpha} = \frac{1}{\sqrt{2}}\left(T^1 \pm iT^2\right), \tag{23.47}$$
where the eigenvalue $\alpha = 1$ and
$$[H, E_\alpha] = \alpha\,E_\alpha$$
$$[E_\alpha, E_{-\alpha}] = \alpha\,H. \tag{23.48}$$


The diagonal components of H are the weights of the representation. 
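
The Cartan–Weyl relations of eqn. (23.48) are easy to verify by direct multiplication. The sketch below assumes the step operators are the combinations $E_{\pm\alpha} = (T^1 \pm iT^2)/\sqrt{2}$ with $H = T^3$ and $\alpha = 1$; this normalization is an assumption chosen so that the relations hold as written, and `step_operator` is a hypothetical helper.

```python
import math

# Fundamental SU(2) generators: one-half the Pauli matrices.
T1 = [[0, 0.5], [0.5, 0]]
T2 = [[0, -0.5j], [0.5j, 0]]
T3 = [[0.5, 0], [0, -0.5]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def scaled(c, a):
    return [[c * x for x in row] for row in a]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def step_operator(sign):
    """Assumed form E_(sign * alpha) = (T^1 + sign * i T^2) / sqrt(2)."""
    return [[(T1[i][j] + sign * 1j * T2[i][j]) / math.sqrt(2) for j in range(2)]
            for i in range(2)]


H = T3
E_plus, E_minus = step_operator(+1), step_operator(-1)
alpha = 1.0

errors = [
    max_abs_diff(commutator(H, E_plus), scaled(alpha, E_plus)),     # [H, E_a] = a E_a
    max_abs_diff(commutator(H, E_minus), scaled(-alpha, E_minus)),  # [H, E_-a] = -a E_-a
    max_abs_diff(commutator(E_plus, E_minus), scaled(alpha, H)),    # [E_a, E_-a] = a H
]
print(errors)
```

The diagonal entries of `H` are $\pm\frac{1}{2}$, the weights of the fundamental representation.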


23.8.2 SU (2) Hermitian adjoint representation 


In the adjoint representation, the generators are simply the components of the 
structure constants in the regular basis: 


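$$T^1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}$$
$$T^2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}$$
$$T^3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{23.49}$$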
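
As a consistency check on the matrices of eqn. (23.49), one can confirm that they obey the same commutation relations as the Pauli-matrix generators of the fundamental representation, $[T^1, T^2] = iT^3$ and cyclic permutations, and that $T^1$ satisfies $(T^1)^3 = T^1$, as required for a matrix whose eigenvalues are $0, \pm 1$. This is a minimal sketch; the helper names are illustrative.

```python
# SU(2) adjoint generators as displayed in eqn. (23.49).
T_ADJ = [
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


def times_i(a):
    return [[1j * x for x in row] for row in a]


# [T^a, T^b] = i T^c for (a, b, c) a cyclic permutation of (1, 2, 3).
closure_error = max(
    max_abs_diff(commutator(T_ADJ[a], T_ADJ[b]), times_i(T_ADJ[c]))
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
)

# A matrix with eigenvalues 0, +1, -1 satisfies T^3 = T.
cube = matmul(T_ADJ[0], matmul(T_ADJ[0], T_ADJ[0]))
spectrum_error = max_abs_diff(cube, T_ADJ[0])

print(closure_error, spectrum_error)
```

The eigenvalues $0, \pm 1$ are what appear on the diagonal once $T^1$ is diagonalized.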


To find a Cartan–Weyl basis, in which the Cartan sub-algebra matrices are
diagonal, we explicitly look for a transformation which diagonalizes one of
the matrices. The same transformation will diagonalize the entire Cartan
sub-algebra. Pick arbitrarily $T^1$ to diagonalize. The self-inverse matrix of
eigenvectors for $T^1$ is easily found. It is given by


-1 0 0 
1 —i 

ssp O 2a ai (23.50) 
OBR 


Constructing the matrices $\Lambda^{-1}T^a\Lambda$, one finds a new set of generators,


00 0 
T'=!01 0 
00 -1 
O 1 i 
T?=| 100 
0 0 


—i 
i l
T?=| -i 0 O |. (23.51) 
0 0 


The Cartan–Weyl basis is obtained from these by constructing the combinations


“=, 1 


E}, = — (T? iP?) 
1 V3 F 
H=T'. (23.52) 
Explicitly, 
0 i 0 
E,;={ 0 0 0 
1 0 0 
0 0 1 
E; =| —i 0 0]. (23.53) 
0 0 0 
It may be verified that 
$$[H, E_\alpha] = \alpha\,E_\alpha \tag{23.54}$$


for $\alpha = \pm 1$. The diagonal values of $H$ are the roots of the Lie algebra. It is
interesting to note that the footprint of SU (2) crops up often in the generators
of other groups. This is because SU (2) sub-groups are a basic entity where the
roots show the simplest reflection symmetry. Since roots occur in signed pairs,
SU (2) is associated with root pairs.


23.8.3 SU (3) Hermitian fundamental representation 


The generators of SU (3)’s fundamental representation are the Gell-Mann matrices:


,( 0 E 
T= -|1 0 0 
2\o 0 0 
TES 
T=-| -i00 
2\o 00 
T, 
PSN nO: E E 
2\o 00 
1 0 0 -l
T4 = 3 0 0 0 
-1 O 0 
ı/ 0 0i 
Ts=5{ 0 0 0 
—i 0 0 
ı/0 0 0 
0 —1 0 
ı/0 9 0 
T= 2 0 0 1 
0 —i 0 
1 -1 0 0 
Ts al 0 


E ae, 
2/3\ 9 0 2 
(23.55)


The generators of the Cartan sub-algebra $T^3$ and $T^8$ are already diagonal in
this representation. Forming a matrix which is an explicit linear combination of
these generators $\theta^a T^a$, the following linear combinations are seen to parametrize
the algebra naturally:


i 
E~; = — (T! + iT” 
1 V2 
i 
E- = — (T + iT’) 
2 V2 


E3 = Í TET 
~ y2 
$H^1 = T^3$
$H^2 = T^8$. (23.56)


These matrices satisfy the Cartan–Weyl relations


$$[H^i, E_\alpha] = \alpha^i E_\alpha$$
$$[E_\alpha, E_{-\alpha}] = \alpha^i H_i, \tag{23.57}$$


where $i$ is summed over the elements of the Cartan sub-algebra. This last
relation tells us that the commutator of the generators for equal and opposite
roots always generates an element of the centre of the group. The coefficients $\alpha^i$
are the components of the root vectors on the sub-space spanned by the Cartan
sub-algebra. Explicitly,


, {0 00 
AE oh 00 
V¥2\ 0 0 0 
poe Si 
Bet 10> “WO: <0 
V¥2\0 0 0 
(OO 
Pe 0 06 
V¥2\ 100 
ae Ome 
ee er ees ae 
V2\0 0 0 
Oo ae O 
pes) 6.08 0 
V2\ 0 -1 0 
, (90 0 
Ess AN0 ee 
Z2\oo0o 0 


(23.58) 


Constructing all of the opposite combinations in the second relation of
eqn. (23.57), one finds the root vectors in the Cartan–Weyl basis,


a4; = (1, 0) 
e=) 
A442 = roa 
2 2 
aama (48) 
2 2 


23.8.4 SU (3) Hermitian adjoint representation 


(23.59) 


The generators in the adjoint representation are obtained from the observation in
eqn. (23.35) that the structure constants form a representation of the Lie algebra
with the same dimension as the group:


$$(T^a)^b{}_c = i f^{ab}{}_c, \tag{23.60}$$
where $a, b, c = 1, \ldots, 8$. The structure constants are


1 


fiz 


(23.61) 


together with anti-symmetric permutations. In explicit form, we have 
T? =i (23.62)


coocoocooeoscse 
cooocoooscs 
OOO OO OOO 
oo oho ooo 
A 
Ww 
Oyo oaodcse 
Gi ghee SS 
Ww 
cococooosce 


The anti-Hermitian form of these matrices is obtained by dropping the leading 
factor of i. The Cartan—Weyl basis for the adjoint representation is obtained 
by diagonalizing one (and thereby several) of the generators. We choose to 
diagonalize $T^8$ because of its simple form. This matrix has four zero eigenvalues
representing an invariant sub-space, so eigenvectors must be constructed for 
these manually. A set of normalized eigenvectors can be formed into a matrix 
which will diagonalize the generators of the Cartan sub-algebra: 


i it 
ee Oo eo 
Be 8 OO OF 20° 0.0 
0 0 -1 0 0 0 0 0 
0 0 0 `- 4 0 0 0 
Be eee ee ae ea (23.63) 
ava 
D7, Oe oe Se 
000004 40 
0 0 0 0 0 0 0 1 


The inverse of this is simply the complex conjugate. The new basis is now
constructed by forming $\Lambda^{-1}T^a\Lambda$:


me g es 

ae ne ee a ae 

-4 -% 0 0000 0 

r-| 0 0 0 00 £0 0 
OF W ono eo 

o 0o 0 4} 0: 20) 10) 20 

0- © 20.2%. 10.70.50 

o 0o 0 00000 
The Cartan–Weyl basis is now obtained by constructing the linear combinations


(23.65) 




Explicitly, 
1
E3=| x 23.66
i 0 0 00000 0 R
o 0000000
These generators satisfy the relations in eqns. (23.57) and define the components
of the root vectors in two ways. The diagonal components of the generators
spanning the Cartan sub-algebra are the components of the root vectors. We
define


$$H^1 = T^3$$
$$H^2 = T^8. \tag{23.67}$$


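The commutators in eqns. (23.57) may now be calculated, and one identifies


$$\alpha_{\pm 1} = \mp(1, 0)$$
$$\alpha_{\pm 2} = \mp\left(\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$$
$$\alpha_{\pm 3} = \mp\left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right). \tag{23.68}$$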
24


Chern–Simons theories


In 2 + 1 dimensions it is possible to construct actions which include a ‘topolog- 
ical’ interaction called the Chern—Simons term. The Chern—Simons term takes 
the form 


dx) 1 
Scs = J (dx) =e" A, An 
Vg 2 
1 
= ae shel Adv An, (24.1) 


for Abelian theories, and the extended form, 
dV, kg? 
V8 2ħ° Ca (Gaa 


pvr 2 
Scs-NA = i Tr Adv A, — 13 ApAvAy J 


(24.2) 


in the non-Abelian case, where Hermitian generators are used. This action is 
real, as may be seen by applying the Lie algebra relation in eqn. (23.3). The 
effect of the Chern-Simons term on the dynamics of a field theory depends 
on whether the Maxwell or Yang—Mills term is also present. Since the 
Chern-Simons term is purely linear in all derivatives, and there are no additional 
constraints, as in the Dirac equation, it does not carry any independent dynamics 
of its own. 

In the absence of dynamics from a Maxwell or Yang–Mills-like contribution
to the action, the effect of this term is to induce a duality of variables, i.e. an
equivalence relation between $F_{\mu\nu}$ and $J^\mu$. Coupled together with a Maxwell or
Yang–Mills term, the Chern–Simons term endows the vector potential with a
gauge-invariant mass [35, 36, 64, 110].

An unusual but important feature of the Chern–Simons action is that it is
independent of the spacetime metric. Since the Levi-Civita tensor transforms
like a tensor density, a factor of $\sqrt{g}$ is therefore required to cancel the one
already in the volume element. This has obvious implications for the usefulness
of the variational definition of energy and momentum, by $T_{\mu\nu}$.


24.1 Parity- and time reversal invariance 


The Chern–Simons term violates parity and time-reversal invariance, both
of which are defined for 2 + 1 dimensions as the reflection in the axis of a
single coordinate [64]. Such a violation is clearly not a fundamental property of nature.
The term is therefore only expected to play a role in physical systems where such
a breakdown of parity invariance is present by virtue of special physical
conditions. There are several such situations: ferromagnetic states of spin fields,
the Hall effect, strong magnetic fields and vortices are examples [54].

The presence of a Chern—Simons term in the action of a field theory would 
lead to a rotation of the plane of polarization of radiation passing through a 
two-dimensional system, as in the Faraday effect (see sections A.6.1 and 7.3.3 
and refs. [14, 16]). In ref. [24], the authors use the formalism of parity-violating 
terms to set limits on parity-violation from astronomical observations of distant 
galaxies. Spin polarized systems can be made into junctions, where Chern- 
Simons coefficients can appear with variable strength and sign [11, 14]. 


24.2 Gauge invariance 


The transformation of the Chern—Simons action under gauge transformations, 
with its independence of the metric tensor, is what leads to its being referred to 
as a topological term. Consider the transformation of the non-Abelian action 
under a gauge transformation 


$$A_\mu \rightarrow UA_\mu U^{-1} - i\frac{\hbar}{g}(\partial_\mu U)U^{-1}; \tag{24.3}$$
it transforms to
S— S+ fæ (3 V”) 


— ” ç pmi = L1 4 
+ 6Cx(Gaa)” Tr f aw [U(0,U )U(0,U~')U(0,U )]. 


(24.4) 
The second term in the transformed expression is a total derivative and therefore
vanishes, provided $U(\infty) = U(0)$: for instance, if $U \rightarrow 1$ in both cases (this
effectively compactifies the spacetime to a sphere). The remaining term is:


ħk 
—  ” pm zj 1 B 
ôS = 6Cx(Gaa)® Tr faw [U(0,U~')U(8,U~')U(0,U") | 


$$\qquad = 8\pi^2\hbar k\,W(U), \tag{24.5}$$
where $W(U)$ is the winding number of the mapping of the spacetime into the
group, which is determined by the cohomology of the spacetime manifold. It
takes integer values $n$. Clearly the action is not invariant under large gauge
transformations. However, if $k$ is quantized, such that $4\pi k$ is an integer,
the action only changes by an integral multiple of $2\pi\hbar$, leaving the phase
$\exp(iS/\hbar + 2\pi i n)$ invariant. This quantization condition has been discussed
in detail by a number of authors [35, 36, 39, 40, 43].
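
The arithmetic behind the quantization condition can be spelled out in one line. The following sketch takes the change in the action from eqn. (24.5), with winding number $W(U) = n$, and introduces the symbol $N$ (not in the original) for the integer value of $4\pi k$.

```latex
\begin{align*}
\frac{\delta S}{\hbar} &= 8\pi^2 k\, n = 2\pi\,(4\pi k)\, n = 2\pi N n,
  \qquad N \equiv 4\pi k \in \mathbb{Z}, \\
\exp\!\left(\frac{i}{\hbar}(S + \delta S)\right)
  &= \exp\!\left(\frac{iS}{\hbar}\right)\exp\left(2\pi i N n\right)
   = \exp\!\left(\frac{iS}{\hbar}\right).
\end{align*}
```

Only the phase, not the action itself, is required to be invariant, which is why a discrete condition on $k$ suffices.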


24.3 Abelian pure Chern—Simons theory 


The Abelian Chern—Simons theory is relatively simple and has been used mainly 
in connection with studies of fractional statistics and the quantum Hall effect, 
where it gives rise to ‘anyons’ [4]. 


24.3.1 Field equations and continuity 


Pure Chern—Simons theory is described by the Chern—Simons action together 
with a gauged matter action. In the literature, Chern—Simons theory is usually 
analysed by coupling it only to some unspecified gauge-invariant source: 


1 
S= J (3u Andra + JAn) . (24.6) 
The variation of the action is given by 
1 
6S = [ee (—ne™3A, 0A; + J“ôA,}— J dorz HApôAn, (24.7) 
implying that the field equations are 
l và 
zee" faa", (24.8) 
with associated boundary (continuity) conditions 
1 
A x HeuorAy = 0, (24.9) 


where the boundary of interest points in the direction of x°. Notice that, whereas 
the field equations are gauge-invariant, the boundary conditions are not. The 
physical interpretation of this result requires a specific context. 
24.4 Maxwell–Chern–Simons theory
24.4.1 Field equations and continuity 


In the literature, Chern—Simons theory is usually analysed by coupling it only to 
some unspecified gauge-invariant current: 


1 1 
S= J (r Fa = 5 He A ðA + Jy) : (24.10) 


The same warnings about the generality of this notation apply as for pure Chern- 
Simons theory. The field equations are now given by 
1 1 A 
— 3" Fuy + ue” Fa = J", (24.11) 
Ho 2 
with associated boundary conditions 


1 
A (=u Fa + Zuena") =0. (24.12) 


24.4.2 Topological mass 


To see that the derivative terms of the Chern—Simons action lead to a gauge- 
invariant massive mode, one may perform a diagonalization to the eigenbasis of 
the action operator: 


1 
S= 3 [car Argh + neht 
= faa On A”. (24.13) 


In a flat space, Cartesian coordinate basis, where all derivatives commute, this 
is seen most easily by writing the components in matrix form: 


= ud — 104 
OX = | -por — ud l. (24.14) 


nð! puð — 


The determinant of this basis-independent operator is the product of its
eigenvalues, which is the product of dispersion constraints. Noting that
$\Box = -\partial_0^2 + \partial_1^2 + \partial_2^2$, it is straightforward to show that


$$\det O = (-\Box)^2(-\Box + \mu^2), \tag{24.15}$$


showing that the dispersion of the Maxwell–Chern–Simons field contains two
massless modes and one massive mode with squared mass $\mu^2$. This massive theory
has been studied in refs. [35, 36, 64, 110].
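
The factorization of the determinant can be checked numerically in momentum space. The sketch below rests on assumptions that go beyond the text: the operator is taken to have mixed components $O^\mu{}_\nu = -\Box\,\delta^\mu{}_\nu + \mu\,\epsilon^{\mu\lambda}{}_\nu\partial_\lambda$ (overall constants dropped), derivatives are replaced by $\partial_\lambda \rightarrow ik_\lambda$ so that $-\Box \rightarrow k^2$, and the metric signature is $(-,+,+)$. The function names are illustrative.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol on indices 0, 1, 2 with epsilon(0, 1, 2) = +1."""
    return (i - j) * (j - k) * (k - i) // 2


ETA = [-1.0, 1.0, 1.0]  # diagonal of the assumed metric, signature (-, +, +)


def mcs_operator(k_lower, mu):
    """Assumed momentum-space operator k^2 delta^m_n + i mu eta^{mm} epsilon_{m l n} k^l."""
    k_upper = [ETA[l] * k_lower[l] for l in range(3)]
    k_squared = sum(k_lower[l] * k_upper[l] for l in range(3))
    rows = []
    for m in range(3):
        row = []
        for n in range(3):
            eps_k = sum(levi_civita(m, l, n) * k_upper[l] for l in range(3))
            row.append((k_squared if m == n else 0.0) + 1j * mu * ETA[m] * eps_k)
        rows.append(row)
    return rows, k_squared


def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


k = [0.3, 1.1, -0.7]   # arbitrary covariant momentum components (k_0, k_1, k_2)
mu = 2.0               # arbitrary Chern-Simons coefficient

operator, k_squared = mcs_operator(k, mu)
lhs = det3(operator)
rhs = k_squared ** 2 * (k_squared + mu ** 2)
print(lhs, rhs)
```

The result does not depend on the sign of `mu` or on the orientation convention for the antisymmetric symbol, since only $\mu^2$ enters the determinant.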
24.4.3 Energy–momentum tensors


In Chern–Simons theory, the energy-momentum tensors $\theta_{\mu\nu}$ and $T_{\mu\nu}$ do not
agree. The reason for this is that the Chern–Simons action is independent of
the metric tensor (it involves only anti-symmetric symbols), thus the variational
definition of $T_{\mu\nu}$ inevitably leads to a zero value. If we use eqn. (11.68) and
assume the gauge-invariant variation in eqn. (4.81), we obtain the following
contributions to $\theta_{\mu\nu}$ from the action in eqn. (24.1),


6 ee EPI F, SEA F” (24.16) 
uv Feny pron 2 mw *p* co g 


The fact that these two tensors do not agree can be attributed to the failure of
the variational definition of $T_{\mu\nu}$ in eqn. (11.79). Since the Chern–Simons term
is independent of the spacetime metric, it cannot be used as a generator for the
conformal symmetry.

The contribution in eqn. (24.16) is not symmetrical but, in using the Bianchi
identity $\epsilon^{\mu\nu\lambda}\partial_\mu F_{\nu\lambda} = 0$, it is seen to be gauge-invariant, provided the
Chern–Simons coefficient is a constant [11, 12, 14].


24.5 Euclidean formulation 


In its Wick-rotated, Hermitian form, the Chern–Simons action acquires a factor
of $i = \sqrt{-1}$, unlike most other action terms, since the Levi-Civita tensor does
not transform under Wick rotation. It has the Abelian form


dx) 1 
Scs_p =i J P He Ayo As, (24.17) 


and the non-Abelian form, for Hermitian generators 


_NA~ i eh“ Tr ý i j : 
CS—NA-E Je GGG) pOvA, 3H à 


(24.18) 


25 


Gravity as a field theory 


This chapter provides the briefest, tangential encounter with the Einsteinian 
gravity viewed as a field theory. Gravity is a huge topic, full of subtleties, and 
it deserves to be introduced as a systematic tower of thought, rather than as a 
gallery of sketchy assertions. The purpose of this chapter is therefore no more 
than to indicate, to those who already know the general theory of relativity, how 
gravity fits into the foregoing discussions, i.e. why the foregoing ideas are still 
valid in the presence of gravity, and how we generalize our notion of covariance 
to include the gravitational force. 


25.1 Newtonian gravity 


Newtonian gravity plays virtually no role in field theory, for the simple reason 
that gravity barely couples to any of the fields. Gravity is such a weak force at 
the scale of elementary particles that it is almost completely negligible. There 
are occasions, however, when we use field theory outside of the realm of the 
elementary physics. For instance, fluid dynamics is a field theory where gravity 
plays an often significant role. 

In order to include gravity in terrestrial systems, we do not need to think about 
Einstein or relativity. Gravity is simply an effective potential 


V = mgx + const., (25.1) 


where x is the height above the centre of gravity. In this effective theory of 
gravity, planets and large objects are considered to be point particles, located 
at the centre of gravity of the system. Eqn. (25.1) expresses a linear, flat-Earth 
geometry, in which the potential is usually measured from the ground up (for 
small distances of a few hundred metres). The arbitrary constant in the potential 
is analogous to the arbitrariness in the electromagnetic potential $A_\mu$. Instead
of gauge invariance, we have a corresponding arbitrariness in the origin of the 
gravitational potential. 
25.2 Curvature


On astrophysical scales, gravity is the dominant force, and we need to consider
the subtleties of general relativity. There are two motivations for wanting to do
this:


• Einsteinian gravity can be formulated as a field theory, in which the metric
tensor (spacetime itself) is also a dynamical field. This enables us to
understand gravity and spacetime as a dynamical system, leaning on all
of the lessons we have learned from electromagnetism etc.


• In the early universe, there was an important coupling between gravity
and other fundamental fields. Thus, relativistic, covariant formulations of
fields which include gravity are important models to consider.


Gravity therefore means Einsteinian gravity here, and this, we know, has a nat- 
ural expression in terms of the intrinsic curvature of spacetime. For the reasons 
discussed in the previous section, it makes no sense to look at non-relativistic 
theories in the presence of a relativistically generalized gravitational potential; 
such combinations would not be consistently compatible. We therefore dispense 
with the non-relativistic theories for the remainder of this short chapter. 


25.3 Particles in a gravitational field 


The essence of general relativity is that gravitational effects can be considered 
as physics in non-inertial frames. A non-inertial frame is a coordinate basis 
which is either accelerating or which contains a gravitational field. These two 
situations are indistinguishable, according to the equivalence principle, and so 
this is a kind of tautology. Indeed, we could go on to refer to the gravitational 
field as an acceleration field. 

How shall we describe physics in such frames? Non-linear coordinate
transformations can always map us from a locally inertial frame,¹ so covariance
will help us to formulate theories optimally. The discussion which follows is
based on the conventions and notations of Weinberg [133]. Readers who are
unfamiliar with gravity could do worse than to consult his book, since there is
no room for more than a cursory sketch here.

Let us denote the coordinates, derivatives and metric in a locally inertial
Cartesian frame by $\xi^\alpha$, $\overset{\xi}{\partial}_\mu$, $\eta_{\mu\nu}$, and the corresponding quantities in any other
coordinate system (flat, curvilinear, curved, accelerating etc.) by $x^\mu$, $\partial_\mu$, $g_{\mu\nu}$.
The transformation which relates the two metrics is written according to the
usual tensor rules,
$$\eta^{\alpha\beta} = g^{\mu\nu}\,L_\mu{}^\alpha L_\nu{}^\beta = g^{\mu\nu}\,(\partial_\mu\xi^\alpha)(\partial_\nu\xi^\beta). \tag{25.2}$$


¹ Suppose you are in a fighter plane and are suffering from the effects of strong acceleration G
forces: to transform to a locally inertial frame, simply press the ejector seat button and you
will soon be in a freely falling coordinate system.


In its locally inertial, or freely falling, coordinate frame, a moving particle seems
to be following a straight-line path (although, since the frame is only inertial
locally, we should not extrapolate too far from our position of observation). The
equation of motion of such a particle would then be
$$\frac{d^2\xi^\alpha}{d\tau^2} = 0, \tag{25.3}$$


where the proper time $\tau$ is defined in the usual way by
$$-c^2d\tau^2 = \eta_{\alpha\beta}\,d\xi^\alpha d\xi^\beta. \tag{25.4}$$


Suppose now we transform into a general set of coordinates, using the
transformation $L_\mu{}^\alpha$. We then have to transform $\xi^\alpha$, so that eqn. (25.3) becomes


$$\frac{d}{d\tau}\left(\frac{d\xi^\alpha}{d\tau}\right) = \frac{d}{d\tau}\left(\frac{\partial\xi^\alpha}{\partial x^\mu}\frac{dx^\mu}{d\tau}\right) = 0. \tag{25.5}$$
Thus, the equation of motion becomes
$$(\partial_\mu\xi^\alpha)\frac{d^2x^\mu}{d\tau^2} + (\partial_\mu\partial_\nu\xi^\alpha)\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0. \tag{25.6}$$


This can be simplified by multiplying through by $\overset{\xi}{\partial}_\alpha x^\lambda$ and using the chain-rule
$(\overset{\xi}{\partial}_\alpha x^\lambda)(\partial_\mu\xi^\alpha) = \delta^\lambda{}_\mu$ to give


$$\frac{d^2x^\lambda}{d\tau^2} + \Gamma^\lambda_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0, \tag{25.7}$$
which is the geodesic equation, where
$$\Gamma^\lambda_{\mu\nu} = (\partial_\mu\partial_\nu\xi^\alpha)(\overset{\xi}{\partial}_\alpha x^\lambda). \tag{25.8}$$


The presence of the affine connection $\Gamma^\lambda_{\mu\nu}$ signals the non-linear nature of the
coordinates. The connection may also be expressed in terms of the metric tensor
as


$$\Gamma^\lambda_{\mu\nu} = \frac{1}{2}g^{\lambda\rho}\left\{\partial_\mu g_{\rho\nu} + \partial_\nu g_{\mu\rho} - \partial_\rho g_{\mu\nu}\right\}. \tag{25.9}$$
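
The expression of the connection in terms of the metric lends itself to a direct numerical evaluation. The following sketch assumes the standard form $\Gamma^\lambda_{\mu\nu} = \frac{1}{2}g^{\lambda\rho}(\partial_\mu g_{\rho\nu} + \partial_\nu g_{\mu\rho} - \partial_\rho g_{\mu\nu})$ and applies it, as an illustrative example not taken from the text, to the unit two-sphere with coordinates $(\theta, \phi)$, using central finite differences for the derivatives of the metric. The function names are hypothetical.

```python
import math


def sphere_metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def metric_derivative(metric, x, rho, h=1e-5):
    """Central-difference estimate of d g_{mu nu} / d x^rho at the point x."""
    x_plus, x_minus = list(x), list(x)
    x_plus[rho] += h
    x_minus[rho] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * h) for j in range(2)] for i in range(2)]


def christoffel(metric, x):
    """gamma[lam][mu][nu] = (1/2) g^{lam rho} (d_mu g_{rho nu} + d_nu g_{mu rho} - d_rho g_{mu nu})."""
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, rho) for rho in range(2)]  # dg[rho][mu][nu]
    return [[[0.5 * sum(g_inv[lam][rho]
                        * (dg[mu][rho][nu] + dg[nu][mu][rho] - dg[rho][mu][nu])
                        for rho in range(2))
              for nu in range(2)] for mu in range(2)] for lam in range(2)]


theta = 0.8
gamma = christoffel(sphere_metric, [theta, 0.3])

print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))  # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))   # Gamma^phi_{theta phi}
```

The two printed pairs agree to the accuracy of the finite-difference step, and the result is symmetric in the lower indices, as used later in the derivation of the geodesic equation from the action.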
25.4 Geodesics


The geodesic equation can also be understood in a different way, from the action 
principle. The geodesic equation is, in a sense, the structure of empty space, so 
what if we take an empty action in a locally inertial rest frame of a general 
curved spacetime and vary it with respect to different paths, as follows: 


$$x^\mu \rightarrow x^\mu(\lambda) + \delta x^\mu(\lambda)? \tag{25.10}$$


The action would then be 
$$S = a\int d\tau, \tag{25.11}$$


where $\tau$ is the proper time, defined in eqn. (3.38) and $a$ is a constant with the
dimensions of energy. Writing this in general coordinates, we have


S= a | Jena, (25.12) 


or — introducing a parameter À, 


S= fo n = fay go n (25.13) 
= da ear aa 


This equation can now be varied with respect to x” to obtain the path of ‘least 
action’ in the coordinate system x. We already know that, in a locally inertial 
frame, the path of an object would be a straight line, and in a rest frame there 
is no motion. So the question is: how does this look to a different observer in 
possibly accelerating coordinates? The variation of the action is 


1 dà dx! dx” dôx” dx” 
as =a f ag E fogo S koipe e =o. (25.14) 


2 dt dà da dà dà 
Since we are looking at a coordinate variation, we have 


$$\delta g_{\mu\nu} = (\partial_\rho g_{\mu\nu})\,\delta x^\rho \tag{25.15}$$


dadt 


see eqn. (4.85). Thus, integrating by parts and writing dà È ^ as dd Fr? 


dx” dx” dx? = 
6S = a = f drf Orsu) — d- dt = Oow) dr £ 


d?x” 
oe =0. (25.16) 


Here we have assumed that the surface term 


dx” 
A (os, =0 (25.17) 
vanishes for continuity. From eqn. (25.9), the result may be identified as


dx! dx” dx? 
5S = af l-r E = = Fs} sands" dr =0. (25.18) 


We have used the symmetry on the lower indices of $\Gamma^\lambda_{\mu\nu}$. Thus we end up with
the geodesic equation once again:


$$\frac{d^2x^\lambda}{d\tau^2} + \Gamma^\lambda_{\mu\nu}\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau} = 0. \tag{25.19}$$


25.5 Curvature 


The curvature of a vector field $\xi^\sigma$ may be defined by the commutator of covariant
derivatives, just as in the case of the electromagnetic field (see eqn. (10.45)).
This defines a process of parallel transport of vectors and a tensor known as the
Riemann curvature tensor:


$$[\nabla_\mu, \nabla_\nu]\,\xi^\lambda = -R^\lambda{}_{\sigma\mu\nu}\,\xi^\sigma. \tag{25.20}$$
Also analogous to electromagnetism is the expression of the curvature as a 
covariant curl: 
a a a 
Rye = Viel gy NOL pice (25.21) 


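This may be compared with eqn. (2.24). The Riemann tensor has the following
symmetry properties:


$$R_{\lambda\mu\nu\kappa} = R_{\nu\kappa\lambda\mu} \tag{25.22}$$
$$R_{\lambda\mu\nu\kappa} = -R_{\mu\lambda\nu\kappa} = -R_{\lambda\mu\kappa\nu} = R_{\mu\lambda\kappa\nu} \tag{25.23}$$
$$R_{\lambda\mu\nu\kappa} + R_{\lambda\kappa\mu\nu} + R_{\lambda\nu\kappa\mu} = 0. \tag{25.24}$$
The Ricci tensor is defined as the contraction
$$R_{\mu\kappa} = g^{\lambda\nu}R_{\lambda\mu\nu\kappa} \tag{25.25}$$
and satisfies
$$R_{\mu\nu} = R_{\nu\mu}. \tag{25.26}$$
The scalar curvature is the total contraction
$$R = g^{\mu\nu}R_{\mu\nu}. \tag{25.27}$$
The curvature satisfies Bianchi identities, just like the electromagnetic field:
$$\nabla_\rho R_{\lambda\mu\nu\kappa} + \nabla_\kappa R_{\lambda\mu\rho\nu} + \nabla_\nu R_{\lambda\mu\kappa\rho} = 0. \tag{25.28}$$


Contracting indices with the metric gives


$$\nabla_\mu\left[R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R\right] = 0. \tag{25.29}$$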
25.6 The action


The action for matter coupled to gravity is written


$$S = S_M + S_G, \tag{25.30}$$
where
$$S_G = -\frac{c^4}{16\pi G}\int (dx)\left[R - 2\Lambda\right]; \tag{25.31}$$
$(dx) = dt\,d^nx\,\sqrt{g}$ and $g = -\det g_{\mu\nu}$. $S_M$ is the action for matter fields. These
act as the source of the gravitational field, i.e. they carry gravitational charge
(mass/energy).

$\Lambda$ is the cosmological constant, which is usually set to zero. The variation of
the action with respect to the metric is


$$\delta\sqrt{g} = -\frac{1}{2}\sqrt{g}\,g_{\mu\nu}\,\delta g^{\mu\nu}$$
$$\delta R = \delta\left(g^{\mu\nu}R_{\mu\nu}\right) = \delta g^{\mu\nu}R_{\mu\nu}. \tag{25.32}$$


Thus, 


ct 1 
8S = ——— | (dx) | —=2,,,[R —2A/c7] + Ru | 8g” 
az foo] 58 2A /c2] + ro] os 


ô SM 
bgt 


+—™ soe" = 0. (25.33) 


The last term is the conformal energy-momentum tensor 


1 A 
Riv 7 R8uv + o2 8H = = Thy: (25.34) 
This is Einstein’s field equation for gravity. It is, of course, supplemented by the 
field equations for matter to complete the dynamical system. Notice that matter 
and energy (the energy-momentum tensor) is the source of gravitation. Matter, 
in other words, carries the gravitational charge: mass/energy. 
The solution of these field equations is non-trivial and beyond the scope of 
this book. 


25.7 Kaluza–Klein theory


Following Maxwell’s treatise on the electromagnetic field, Theodor Kaluza
was amongst the first to propose a scheme for unifying the forces of nature
using a classical field theory, based on Einstein’s equations. Kaluza’s paper,




communicated to Einstein, endured a long delay before its publication in 1921. 
His main idea, later refined by Oskar Klein, made the bold assertion that, if one 
postulated the existence of extra dimensions, then both of the known forces of 
nature (electromagnetism and gravity) could be unified, using Einstein’s idea of 
spacetime curvature. In Kaluza–Klein theory, the line element is assumed to
have the usual form

$$ds^2 = g_{\hat\mu\hat\nu}\,dx^{\hat\mu}dx^{\hat\nu}, \tag{25.35}$$

where the careted indices run from 0,...,5 and $x^{\hat\mu} = (ct, x^1, x^2, x^3, y) = (x^\mu, y)$.
Uncareted indices represent the usual 3 + 1 dimensional vectors of
general relativity. In order to account for the U(1) symmetry, Klein proposed 
that the extra dimension should have the topology of a circle, with length L. The 
electromagnetic field plays the role of a vector field on the 3 + 1 dimensional 
spacetime, seen as the projection of the curvature of the extra dimension: 


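$$ds^2 = g_{\hat\mu\hat\nu}\,dx^{\hat\mu}dx^{\hat\nu} = g_{\mu\nu}\,dx^\mu dx^\nu + \left(dy + \kappa A_\mu(x)\,dx^\mu\right)^2, \tag{25.36}$$

where $\kappa$ is a constant. Covariance in the extra dimension determines the
transformation rule for $A_\mu$ under coordinate transformations $y' = \theta(y, x^\mu)$:

$$dy' = \frac{\partial\theta}{\partial y}\,dy + \partial_\mu\theta\,dx^\mu. \tag{25.37}$$

For consistency with eqn. (25.36), one requires $\partial\theta/\partial y = 1$, so that under a
change of $y$ only,

$$\begin{aligned}
dy + \kappa A_\mu dx^\mu \rightarrow\; & dy' + \kappa A'_\mu dx^\mu \\
=\; & (dy + \partial_\mu\theta\,dx^\mu) + \kappa A'_\mu dx^\mu \\
=\; & dy + \kappa\left(A'_\mu(x) + \kappa^{-1}\partial_\mu\theta\right)dx^\mu.
\end{aligned} \tag{25.38}$$
Invariance of $ds^2$ therefore requires
$$A'_\mu(x) = A_\mu(x) - \kappa^{-1}\partial_\mu\theta, \tag{25.39}$$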


which is the electromagnetic gauge transformation. From the line element, the 
metric is 


$$g_{\hat\mu\hat\nu} = \begin{pmatrix} g_{\mu\nu} + \kappa^2 A_\mu A_\nu & \kappa A_\mu \\ \kappa A_\nu & 1 \end{pmatrix}; \tag{25.40}$$

however, by changing coordinates to the so-called horizontal lift basis, with
1-forms:

$$\hat\omega^\mu = dx^\mu$$
$$\hat\omega^5 = dy + \kappa A_\mu(x)\,dx^\mu, \tag{25.41}$$
the metric may be diagonalized, at the expense of non-Cartesian coordinates:

$$\hat g_{\hat\mu\hat\nu} = \begin{pmatrix} g_{\mu\nu} & 0 \\ 0 & 1 \end{pmatrix}. \tag{25.42}$$
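The basis vectors conjugate to the 1-forms satisfy $\hat\omega^{\hat\mu}(\hat e_{\hat\nu}) = \delta^{\hat\mu}{}_{\hat\nu}$, i.e.

$$\hat e_\mu = \partial_\mu - \kappa A_\mu\,\partial_y$$
$$\hat e_5 = \partial_y. \tag{25.43}$$
In this anholonomic basis, there is one non-zero commutator:
$$[\hat e_\mu, \hat e_\nu] = -\kappa F_{\mu\nu}(x)\,\partial_y, \tag{25.44}$$
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, which gives the Lie algebra relation
$$[\hat e_{\hat\mu}, \hat e_{\hat\nu}] = C_{\hat\mu\hat\nu}{}^{\hat\lambda}\,\hat e_{\hat\lambda}. \tag{25.45}$$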
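The algebra relating the line element (25.36), the metric (25.40) and the horizontal lift basis (25.43) can be checked numerically. The following is a minimal Python sketch using only the standard library; the flat four-dimensional metric, the random values of $A_\mu$ and the value of $\kappa$ are arbitrary illustrative assumptions, and the function names are hypothetical.

```python
import random

def kk_metric(g4, A, kappa):
    """Assemble the 5x5 metric of eqn. (25.40) from g_{mu nu}, A_mu and kappa."""
    n = len(A)
    g5 = [[0.0] * (n + 1) for _ in range(n + 1)]
    for m in range(n):
        for v in range(n):
            g5[m][v] = g4[m][v] + kappa ** 2 * A[m] * A[v]
        g5[m][n] = g5[n][m] = kappa * A[m]
    g5[n][n] = 1.0
    return g5

def interval_from_metric(g5, dX):
    """ds^2 = g_{MN} dX^M dX^N with dX = (dx^mu, dy)."""
    size = len(dX)
    return sum(g5[a][b] * dX[a] * dX[b] for a in range(size) for b in range(size))

def interval_from_line_element(g4, A, kappa, dx, dy):
    """ds^2 written as in eqn. (25.36)."""
    size = len(dx)
    four_d = sum(g4[m][v] * dx[m] * dx[v] for m in range(size) for v in range(size))
    return four_d + (dy + kappa * sum(a * d for a, d in zip(A, dx))) ** 2

def metric_in_lift_basis(g5, A, kappa):
    """Components g(e_a, e_b) for e_mu = d_mu - kappa A_mu d_y and e_5 = d_y."""
    n = len(A)
    basis = []
    for m in range(n):
        e = [0.0] * (n + 1)
        e[m] = 1.0
        e[n] = -kappa * A[m]
        basis.append(e)
    basis.append([0.0] * n + [1.0])
    return [[sum(g5[a][b] * ea[a] * eb[b]
                 for a in range(n + 1) for b in range(n + 1))
             for eb in basis] for ea in basis]

random.seed(1)
# Flat example metric with a minus sign on the zeroth component (an assumption).
g4 = [[-1.0, 0.0, 0.0, 0.0],
      [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0]]
A = [random.uniform(-1, 1) for _ in range(4)]
kappa = 0.7
dx = [random.uniform(-1, 1) for _ in range(4)]
dy = random.uniform(-1, 1)

g5 = kk_metric(g4, A, kappa)
ds2_a = interval_from_metric(g5, dx + [dy])
ds2_b = interval_from_line_element(g4, A, kappa, dx, dy)
lifted = metric_in_lift_basis(g5, A, kappa)
```

Under these assumptions `ds2_a` and `ds2_b` agree to rounding error, and `lifted` is block diagonal with $g_{\mu\nu}$ in the upper block and 1 in the lower corner, as in eqn. (25.42).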


The affine connection, in a non-holonomic basis, is 


1 >, x X 
Tiia > 2 [ê Euv + ev Sua T eu 8av + Cava + Curv H Cavu] ’ (25.46) 
so that we have non-zero components 
X à R 1 
Pws = Pusv = Tsu = 5k Fw 
Psss=0 , Pun =T u.: (25.47) 
From these, one may calculate the scalar curvature for the Einstein action, 
D _ pw PUS 
R= R” v +2R u5 
2 
= R+ a Fu. (25.48) 


Thus, the Einstein action, in five dimensions, automatically incorporates and 
extrapolates the Maxwell action: 


ct A TA 
S=- fa tdyyg E = 241| (25.49) 


Kaluza–Klein theory ran into trouble when it attempted to incorporate the
newly discovered nuclear forces in a common framework, and was eventually
abandoned in its original form. However, the essence of Kaluza–Klein theory
lives on, in a more sophisticated guise, in super-string theory.
Appendix A


Useful formulae


A.1 The delta function


The Dirac delta function is a bi-local distribution defined by the relations

$$\delta(t, t') = 0, \qquad t - t' \neq 0 \tag{A.1}$$
$$\delta(t, t') = \infty, \qquad t - t' = 0 \tag{A.2}$$
$$\delta(x, x') = \delta(x - x') \tag{A.3}$$
$$\int_{-\infty}^{+\infty} dx'\,\delta(x, x')\,f(x') = f(x) \tag{A.4}$$
$$\int_{-\infty}^{+\infty} dx'\,\delta(x - x') = 1. \tag{A.5}$$


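If $f(x)$ is a function which is symmetrical about $x_0$, then

$$\int_{-\infty}^{x_0} \delta(x_0 - x')\,f(x')\,dx' = \int_{x_0}^{+\infty} \delta(x_0 - x')\,f(x')\,dx'; \tag{A.6}$$

thus, by symmetry,
$$\int_{-\infty}^{x_0} \delta(x_0 - x')\,f(x')\,dx' = \frac{1}{2}f(x_0). \tag{A.7}$$

A useful integral representation of the delta function is given by the Fourier
integral
$$\delta(x - x') = \int \frac{dk}{2\pi}\,e^{ik(x - x')}. \tag{A.8}$$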
Various other representations of the delta function are useful. For instance

$$\delta(x) = \lim_{\alpha\rightarrow 0}\frac{1}{\sqrt{2\pi\alpha}}\,e^{-x^2/2\alpha}. \tag{A.9}$$

The Fourier representation of the $(n + 1)$ dimensional delta function

$$\delta(x, x') \equiv \delta(x - x') = \int\frac{d^{n+1}k}{(2\pi)^{n+1}}\,e^{ik(x - x')} = \delta(x^0 - x^{0\prime})\,\delta(x^1 - x^{1\prime})\cdots\delta(x^n - x^{n\prime}) \tag{A.10}$$
in particular is used in solving for Green functions. Here the shorthand notation
$k(x - x')$ in the exponential stands for $k_\mu(x - x')^\mu$.

Derivatives of the delta function normally refer to derivatives of the test 
functions which they multiply. Meaning may be assigned to these as follows. 


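Consider the boundary value of a function $f(x)$. From the property of the delta
function,

$$\int \delta(x - a)\,f(x - a)\,dx = f(0). \tag{A.11}$$

Now, differentiating with respect to $a$,

$$\frac{d}{da}f(0) = \int\left[\left(\frac{d}{da}\delta(x - a)\right)f(x - a) + \delta(x - a)\,\frac{d}{da}f(x - a)\right]dx = 0. \tag{A.12}$$
From this, we discover that
$$\int \left(\frac{d}{da}\delta(x - a)\right)f(x - a)\,dx = -\int \delta(x - a)\,\frac{d}{da}f(x - a)\,dx, \tag{A.13}$$
or
$$f(t)\,\partial_t\delta(t) = -\delta(t)\,\partial_t f(t), \tag{A.14}$$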


which effectively defines the derivative of the delta function. 
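Relation (A.14) can be illustrated numerically by replacing the delta function with a narrow Gaussian. The sketch below is only an illustration under stated assumptions: the Gaussian width `alpha`, the test function `f` and the integration range are arbitrary choices, and the helper names are hypothetical.

```python
import math

def nascent_delta(t, alpha):
    """Gaussian approximation to delta(t); it sharpens as alpha -> 0."""
    return math.exp(-t * t / (2.0 * alpha)) / math.sqrt(2.0 * math.pi * alpha)

def nascent_delta_prime(t, alpha):
    """Exact t-derivative of the Gaussian above."""
    return -t / alpha * nascent_delta(t, alpha)

def integrate(func, a, b, steps):
    """Midpoint rule on the interval [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

alpha = 1.0e-4

def f(t):
    """Smooth test function with f'(0) = 2."""
    return math.exp(2.0 * t)

def df(t):
    """Derivative of the test function."""
    return 2.0 * math.exp(2.0 * t)

# Left- and right-hand sides of (A.14), both under the integral sign.
lhs = integrate(lambda t: f(t) * nascent_delta_prime(t, alpha), -0.5, 0.5, 20000)
rhs = -integrate(lambda t: nascent_delta(t, alpha) * df(t), -0.5, 0.5, 20000)
```

Both integrals approach $-f'(0) = -2$ as `alpha` is reduced, which is the sense in which (A.14) holds under the integral sign.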
A useful relation for the one-dimensional delta function of a function $g(x)$
with several roots satisfying $g(x_i) = 0$ is:

$$\delta(g(x)) = \sum_i \frac{1}{|g'(x_i)|}\,\delta(x - x_i), \tag{A.15}$$

where $x_i$ are the roots of the function $g(x)$ and the prime denotes the derivative
with respect to $x$. This is easily proven by change of variable. As with all
delta-function relations, this is only strictly valid under the integral sign. Given

$$I = \int dx\,f(x)\,\delta(g(x)), \tag{A.16}$$
change variables to $x' = g(x)$. This incurs a Jacobian in the measure
$$|J| = \left|\frac{\partial x}{\partial x'}\right| = \left|\frac{dx}{dg(x)}\right| = \left|\frac{dg(x)}{dx}\right|^{-1}.$$
So,
$$I = \int dx'\,\frac{1}{|g'(g^{-1}(x'))|}\,f(g^{-1}(x'))\,\delta(x'). \tag{A.17}$$

In replacing $x$ by $g^{-1}$, we satisfy the rules of the change of variable, but the
inverse function $g^{-1}(x')$ is not usually known. Fortunately, the singular nature
of the delta function simplifies the calculation, since it implies that contributions
can only come from the roots of $g(x)$; thus, the expression becomes

$$I = \sum_i \frac{f(x_i)}{|g'(x_i)|}. \tag{A.18}$$

In summary, one may use eqn. (A.15) under the integral sign generally,
thanks to the extremely singular nature of the delta function, provided all
multiplying functions in the integrand are evaluated at the roots of the original
function $g(x)$.
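As a hedged numerical illustration of the root formula, the following Python sketch approximates the delta function by a narrow Gaussian and compares the integral of $f(x)\,\delta(g(x))$ with the sum over roots. The choices $g(x) = x^2 - 1$, $f(x) = x + 2$, the width `alpha` and the helper names are assumptions made for the example only.

```python
import math

def nascent_delta(u, alpha):
    """Gaussian approximation to delta(u); it sharpens as alpha -> 0."""
    return math.exp(-u * u / (2.0 * alpha)) / math.sqrt(2.0 * math.pi * alpha)

def integrate(func, a, b, steps):
    """Midpoint rule on the interval [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def g(x):
    """Example argument of the delta function, with roots at x = -1 and x = +1."""
    return x * x - 1.0

def dg(x):
    """Derivative g'(x)."""
    return 2.0 * x

def f(x):
    """Example multiplying function."""
    return x + 2.0

alpha = 1.0e-4

# Direct integral of f(x) delta(g(x)) with the smeared delta function.
numeric = integrate(lambda x: f(x) * nascent_delta(g(x), alpha), -3.0, 3.0, 120000)

# Sum over roots: f(x_i) / |g'(x_i)|.
from_roots = sum(f(root) / abs(dg(root)) for root in (-1.0, 1.0))
```

With these choices the sum over roots is $1/2 + 3/2 = 2$, and the smeared integral agrees with it up to corrections of order `alpha`.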


A.2 The step function 


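$$\theta(t - t') = \begin{cases} 1 & t - t' > 0 \\ \frac{1}{2} & t = t' \\ 0 & t - t' < 0. \end{cases} \tag{A.19}$$

An integral representation of this may be expressed in two equivalent forms:

$$\theta(t - t') = i\int_{-\infty}^{\infty}\frac{d\alpha}{2\pi}\,\frac{e^{-i\alpha(t - t')}}{\alpha + i\epsilon}$$
$$\theta(t - t') = -i\int_{-\infty}^{\infty}\frac{d\alpha}{2\pi}\,\frac{e^{i\alpha(t - t')}}{\alpha - i\epsilon}, \tag{A.20}$$

where the limit $\epsilon \rightarrow 0$ is understood. The derivative of the step function is a
delta function,

$$\partial_t\theta(t - t') = \delta(t - t'). \tag{A.21}$$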


A.3 Anti-symmetry and the Jacobi identity 


The commutator (or indeed any anti-symmetrical quantity) has the purely 
algebraic property that: 


[A, [B, C]] + [B,[C, A]] + [C, [A, B]] = 0. (A.22) 
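For matrix commutators the identity can be confirmed directly. The sketch below uses random integer $3 \times 3$ matrices so that the arithmetic is exact; the matrix size, the random seed and the helper names are arbitrary assumptions.

```python
import random

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def add(*mats):
    """Element-wise sum of several matrices."""
    n = len(mats[0])
    return [[sum(M[i][j] for M in mats) for j in range(n)] for i in range(n)]

def random_matrix(n):
    """Random integer matrix, so that all arithmetic below is exact."""
    return [[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]

random.seed(0)
A, B, C = random_matrix(3), random_matrix(3), random_matrix(3)

# Cyclic sum of nested commutators, as in eqn. (A.22).
jacobi_sum = add(commutator(A, commutator(B, C)),
                 commutator(B, commutator(C, A)),
                 commutator(C, commutator(A, B)))
```

The result `jacobi_sum` is the zero matrix for any choice of `A`, `B` and `C`.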
A.4 Anti-symmetric tensors in Euclidean space


Anti-symmetric tensors arise in many situations in field theory. In most cases,
we shall only be interested in the two-, three- and four-dimensional tensors,
defined respectively by

$$\epsilon_{ij} = \begin{cases} +1 & ij = 12 \\ -1 & ij = 21 \\ 0 & \text{otherwise,} \end{cases} \tag{A.23}$$
$$\epsilon_{ijk} = \begin{cases} +1 & ijk = 123 \text{ and even permutations} \\ -1 & ijk = 321 \text{ and other odd permutations} \\ 0 & \text{otherwise,} \end{cases} \tag{A.24}$$
$$\epsilon_{ijkl} = \begin{cases} +1 & ijkl = 1234 \text{ and even permutations} \\ -1 & ijkl = 1243 \text{ and other odd permutations} \\ 0 & \text{otherwise.} \end{cases} \tag{A.25}$$

There are as many values for the indices as there are indices on the tensors in
the above relations. Because of the anti-symmetric properties, the following
relations are also true.

$$\begin{aligned}
\epsilon_{ij} &= -\epsilon_{ji} \\
\epsilon_{ijk} &= \epsilon_{kij} = \epsilon_{jki} = -\epsilon_{kji} = -\epsilon_{ikj} = -\epsilon_{jik} \\
\epsilon_{ii} &= 0 \\
\epsilon_{iij} &= 0 \\
\epsilon_{iijk} &= 0.
\end{aligned} \tag{A.26}$$


The number of different permutations increases as the factorial of the number 
of indices on the tensor. The different permutations can easily be generated by 
computing the determinant 


(A.27) 


a.s. sS S. 
a a a 
~~ OO 


as a mnemonic, but the signs will not automatically distinguish even and odd 
permutations, so this is not a practical procedure. 

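Contractions of indices on anti-symmetric objects are straightforward to work
out. The simplest of these are trivial to verify:

$$\begin{aligned}
\epsilon^{ij}\epsilon_{kl} &= \delta^i{}_k\delta^j{}_l - \delta^i{}_l\delta^j{}_k \\
\epsilon^{ij}\epsilon_{kj} &= \delta^i{}_k \\
\epsilon^{ij}\epsilon_{ik} &= \delta^j{}_k \\
\epsilon^{ij}\epsilon_{ij} &= 2.
\end{aligned} \tag{A.28}$$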
More general contractions can be calculated by expressing the anti-symmetric
tensor products as combinations of Kronecker deltas with varying signs and
permutations of indices. These are most easily expressed using a notational
shorthand for anti-symmetrization. Embedded square brackets are used to
denote the anti-symmetrization over a set of indices. For example,

$$X_{[a}Y_{b]} = \frac{1}{2!}\left(X_aY_b - X_bY_a\right), \tag{A.29}$$

$$X_{[a}Y_bZ_{c]} = \frac{1}{3!}\left(X_aY_bZ_c + X_cY_aZ_b + X_bY_cZ_a - X_cY_bZ_a - X_aY_cZ_b - X_bY_aZ_c\right), \tag{A.30}$$
and higher generalizations.


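Consider then the product of two three-dimensional Levi-Civita symbols. It
may be proven on the grounds of symmetry alone that

$$\epsilon^{ijk}\epsilon_{lmn} = 3!\,\delta^i{}_{[l}\delta^j{}_m\delta^k{}_{n]}, \tag{A.31}$$
where
$$\delta^i{}_{[l}\delta^j{}_m\delta^k{}_{n]} = \frac{1}{3!}\left(\delta^i{}_l\delta^j{}_m\delta^k{}_n + \delta^i{}_m\delta^j{}_n\delta^k{}_l + \delta^i{}_n\delta^j{}_l\delta^k{}_m - \delta^i{}_n\delta^j{}_m\delta^k{}_l - \delta^i{}_l\delta^j{}_n\delta^k{}_m - \delta^i{}_m\delta^j{}_l\delta^k{}_n\right). \tag{A.32}$$
Contracting this on one index (setting $i = l$), and writing the outermost
permutation explicitly, we have

$$\epsilon^{ijk}\epsilon_{imn} = 2!\left(\delta^i{}_i\,\delta^j{}_{[m}\delta^k{}_{n]} - \delta^i{}_m\,\delta^j{}_{[i}\delta^k{}_{n]} - \delta^i{}_n\,\delta^j{}_{[m}\delta^k{}_{i]}\right). \tag{A.33}$$
Summing over $i$ gives

$$\epsilon^{ijk}\epsilon_{imn} = 2!\,(3 - 1 - 1)\,\delta^j{}_{[m}\delta^k{}_{n]} = \delta^j{}_m\delta^k{}_n - \delta^j{}_n\delta^k{}_m. \tag{A.34}$$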


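It is not difficult to see that this procedure may be repeated for n-dimensional
products,

$$\epsilon^{ij\ldots k}\epsilon_{lm\ldots n} = n!\,\delta^i{}_{[l}\delta^j{}_m\cdots\delta^k{}_{n]}. \tag{A.35}$$
Again, setting $i = l$ and expanding the outermost permutation gives

$$\begin{aligned}
\epsilon^{ij\ldots k}\epsilon_{im\ldots n} &= (n-1)!\left(\delta^i{}_i\,\delta^j{}_{[m}\cdots\delta^k{}_{n]} - \delta^i{}_m\,\delta^j{}_{[i}\cdots\delta^k{}_{n]} - \cdots\right) \\
&= (n-1)!\left(\delta^i{}_i - (n-1)\right)\delta^j{}_{[m}\cdots\delta^k{}_{n]}.
\end{aligned} \tag{A.36}$$
Since $\delta^i{}_i = n$, the first bracket in the result above always reduces to unity. We
may also write this result in a more general way, for the contraction of two
$p$-index anti-symmetric products in $n$ dimensions:

$$p!\,\delta^i{}_{[i}\delta^j{}_m\cdots\delta^k{}_{n]} = (p-1)!\,(n - p + 1)\,\delta^j{}_{[m}\cdots\delta^k{}_{n]}. \tag{A.37}$$
This formula leads to a number of frequently used results:

$$\begin{aligned}
\epsilon^{ijk}\epsilon_{imn} &= \delta^j{}_m\delta^k{}_n - \delta^j{}_n\delta^k{}_m \\
\epsilon^{ijkl}\epsilon_{imnp} &= 3!\,\delta^j{}_{[m}\delta^k{}_n\delta^l{}_{p]} \\
\epsilon^{ijkl}\epsilon_{ijnp} &= 2!\,(4 - 2)\,\delta^k{}_{[n}\delta^l{}_{p]} = 2\left(\delta^k{}_n\delta^l{}_p - \delta^k{}_p\delta^l{}_n\right) \\
\epsilon^{ijkl}\epsilon_{ijkl} &= 2(4^2 - 4) = 24.
\end{aligned} \tag{A.38}$$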
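These contraction formulae can be verified by brute force. The following Python sketch builds the Levi-Civita symbol from permutation parity and checks the three- and four-dimensional results exhaustively; indices run from 0 rather than 1 purely for programming convenience, and the function names are hypothetical.

```python
from itertools import product

def levi_civita(*idx):
    """Sign of the permutation idx of (0, ..., n-1); 0 if any index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    sign = 1
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            if idx[a] > idx[b]:  # each inversion flips the sign
                sign = -sign
    return sign

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

R3, R4 = range(3), range(4)

# eps^{ijk} eps_{imn} = delta^j_m delta^k_n - delta^j_n delta^k_m
three_dim_ok = all(
    sum(levi_civita(i, j, k) * levi_civita(i, m, n) for i in R3)
    == delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
    for j, k, m, n in product(R3, repeat=4))

# eps^{ijkl} eps_{ijnp} = 2 (delta^k_n delta^l_p - delta^k_p delta^l_n)
four_dim_ok = all(
    sum(levi_civita(i, j, k, l) * levi_civita(i, j, n, p) for i in R4 for j in R4)
    == 2 * (delta(k, n) * delta(l, p) - delta(k, p) * delta(l, n))
    for k, l, n, p in product(R4, repeat=4))

# Total contractions: n! non-zero components, each squared to +1.
full_3 = sum(levi_civita(*ijk) ** 2 for ijk in product(R3, repeat=3))
full_4 = sum(levi_civita(*ijkl) ** 2 for ijkl in product(R4, repeat=4))
```

In Euclidean space upper and lower indices are equivalent, so the same function serves for both symbols in each product.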


A.5 Anti-symmetric tensors in Minkowski spacetime 


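In Minkowski spacetime, we have to distinguish between up and down indices.
It is normal to define
$$\epsilon^{\mu\nu} = \begin{cases} +1 & \mu\nu = 01 \\ -1 & \mu\nu = 10 \\ 0 & \text{otherwise,} \end{cases} \tag{A.39}$$
$$\epsilon^{\mu\nu\lambda} = \begin{cases} +1 & \mu\nu\lambda = 012 \text{ and even permutations} \\ -1 & \mu\nu\lambda = 210 \text{ and other odd permutations} \\ 0 & \text{otherwise,} \end{cases} \tag{A.40}$$
$$\epsilon^{\mu\nu\lambda\rho} = \begin{cases} +1 & \mu\nu\lambda\rho = 0123 \text{ and even permutations} \\ -1 & \mu\nu\lambda\rho = 0132 \text{ and other odd permutations} \\ 0 & \text{otherwise.} \end{cases} \tag{A.41}$$

Indices are raised and lowered using the metric for the appropriate dimensional
spacetime. Since the zeroth component always incurs a minus sign,

$$\epsilon_{\mu\nu\lambda\rho} = g_{\mu\sigma}g_{\nu\kappa}g_{\lambda\tau}g_{\rho\delta}\,\epsilon^{\sigma\kappa\tau\delta}$$
$$\epsilon_{0123} = -1\cdot 1\cdot 1\cdot 1\cdot\epsilon^{0123}, \tag{A.42}$$

one has all of the above definitions with indices lowered on the left hand side
and minus signs changed on the right hand side. This also means that all of the
contraction formulae incur an additional minus sign. This leads to a
number of frequently used results:

$$\begin{aligned}
\epsilon^{\mu\nu\lambda}\epsilon_{\mu\rho\sigma} &= -\delta^\nu{}_\rho\delta^\lambda{}_\sigma + \delta^\nu{}_\sigma\delta^\lambda{}_\rho \\
\epsilon^{\mu\nu\lambda\rho}\epsilon_{\mu\sigma\kappa\tau} &= -3!\,\delta^\nu{}_{[\sigma}\delta^\lambda{}_\kappa\delta^\rho{}_{\tau]} \\
\epsilon^{\mu\nu\lambda\rho}\epsilon_{\mu\nu\sigma\tau} &= -2\left(\delta^\lambda{}_\sigma\delta^\rho{}_\tau - \delta^\lambda{}_\tau\delta^\rho{}_\sigma\right) \\
\epsilon^{\mu\nu\lambda\rho}\epsilon_{\mu\nu\lambda\rho} &= -24.
\end{aligned} \tag{A.43}$$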
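The extra minus sign can be seen explicitly in a short computation. The sketch below assumes the flat diagonal metric $\mathrm{diag}(-1, 1, 1, 1)$, takes $\epsilon^{0123} = +1$ as in eqn. (A.41), and lowers indices with the metric; the function names are hypothetical.

```python
from itertools import product

def levi_civita(*idx):
    """Sign of the permutation idx of (0, ..., n-1); 0 if any index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    sign = 1
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            if idx[a] > idx[b]:
                sign = -sign
    return sign

ETA = [-1, 1, 1, 1]  # assumed diagonal metric; minus sign on the zeroth component

def eps_up(*idx):
    """eps^{mu nu lambda rho}, normalized so that eps^{0123} = +1."""
    return levi_civita(*idx)

def eps_down(*idx):
    """eps_{mu nu lambda rho}: each index lowered with the diagonal metric."""
    factor = 1
    for a in idx:
        factor *= ETA[a]
    return factor * levi_civita(*idx)

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

R4 = range(4)

# Total contraction, last line of (A.43).
full = sum(eps_up(*i) * eps_down(*i) for i in product(R4, repeat=4))

# Two-index contraction, third line of (A.43).
two_index_ok = all(
    sum(eps_up(m, n, l, r) * eps_down(m, n, s, t) for m in R4 for n in R4)
    == -2 * (delta(l, s) * delta(r, t) - delta(l, t) * delta(r, s))
    for l, r, s, t in product(R4, repeat=4))
```

Every non-zero component of the four-index symbol contains the index 0 exactly once, which is why lowering all four indices always contributes a single factor of $-1$.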
A.6 Doubly complex numbers


Complex numbers $z = x + iy$ and the conjugates $z^* = x - iy$ are vectors in
the Argand plane. They form a complete covering of the two-dimensional space
and, because of de Moivre’s theorem,

$$e^{i\theta} = \cos\theta + i\sin\theta, \tag{A.44}$$

they are particularly suited to problems where rotation or circular symmetry
is expected. But what of problems where rotation occurs in two separate,
orthogonal planes? It seems logical to suppose that a complex representation
of such rotation could be applied to each orthogonal plane individually. But
such a description would require two separate kinds of vectors $x + iy$ for one
plane and $x + jz$ for the orthogonal plane, where $i = \sqrt{-1}$ and $j = \sqrt{-1}$. We
must treat these two imaginary numbers as independent vectors, such that

$$\begin{aligned}
i^2 &= -1 \\
j^2 &= -1 \\
ij &\neq -1.
\end{aligned} \tag{A.45}$$


Using these quantities, we can formulate doubly complex numbers

$$w = x + iy - jz$$
$$W = X + iY - jZ \tag{A.46}$$

as an alternative representation to the three-dimensional vectors
$\mathbf{w} = x\,\hat{\mathbf{i}} + y\,\hat{\mathbf{j}} + z\,\hat{\mathbf{k}}$. The final line in eqn. (A.45) above leads to an interesting question.
What commutation properties should we assign to these objects? There are two
possibilities:

$$ij = \pm ji. \tag{A.47}$$


Interestingly, these two signs correspond to the representations of two different 
groups. When i and j commute, the w form a representation of the group U (1) x 
U (1) which corresponds to independent rotations about two orthogonal axes z 
and y, but no rotation about the third axis x, in a three-dimensional space. This 
result is, in fact, trivial from de Moivre’s theorem. 

When i and j anti-commute, the w form a representation of SU (2), the group 
of three-dimensional rotations. To show this we must introduce some notation 
for complex conjugation with respect to the i and j parts. Let us denote 


w =x —iy —jz 
w=xtiytjz 
w =x —iy + jz, (A.48) 
which have the following algebraic products:


1 


w w =x? + yz — 2ijyz 
ww = x? + y? + 2? + 2ijyz. (A.49) 


Thus, the length of a vector is 
1 ij ij 
zW w+ win) =P ty +2 = ww, (A.50) 


and the scalar product is 


iJ 


1 ij ji ij 
weW=i (wit Wri + hw), (A.51) 


It is interesting, and significant, that concealed within these products are the
vector and scalar products for Euclidean space. If we assume that i and j
anti-commute, we have


Go, W) =bW SG 4 pV 427) EE O 
—j(zX —xZ) —i(yZ — zY) 
= (w: W)1+ (wx W), (A.52) 


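where we have identified the complex numbers with Euclidean unit vectors as
follows:

$$\begin{aligned}
1 &\leftrightarrow \text{scalars} \\
i &\leftrightarrow \hat{\mathbf{k}} \\
j &\leftrightarrow -\hat{\mathbf{j}} \\
ij &\leftrightarrow -\hat{\mathbf{i}}.
\end{aligned} \tag{A.53}$$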


When the coupling between planes is unimportant, i and j commute and the 
power of this algebraic tool is maximal. An application of this method is given 
in section A.6.1. 


A.6.1 Refraction in a magnetized medium 


The addition of a magnetic field leads to the interesting phenomenon of plane
wave rotation, studied in section 7.3.3. Neglecting attenuation, $\gamma = 0$, the
forcing term can be written in the form of a general Lorentz force

$$m\frac{d^2\mathbf{s}}{dt^2} + k\,\mathbf{s} = -e\left(\mathbf{E} + \frac{d\mathbf{s}}{dt}\times\mathbf{B}\right), \tag{A.54}$$
or writing out the components and defining $\omega_0^2 = k/m$, $B = B_z$,

$$\frac{d^2 s_x}{dt^2} + \frac{eB}{m}\frac{ds_y}{dt} + \omega_0^2 s_x = -\frac{e}{m}E_x, \tag{A.55}$$
$$\frac{d^2 s_y}{dt^2} - \frac{eB}{m}\frac{ds_x}{dt} + \omega_0^2 s_y = -\frac{e}{m}E_y. \tag{A.56}$$

These two equations may be combined into a single equation by defining
complex coordinates $s = s_x + is_y$ and $E = E_x + iE_y$, provided $\omega_0$ is an isotropic
spring constant, i.e. $\omega_{0x} = \omega_{0y}$:

$$\frac{d^2 s}{dt^2} - i\frac{eB}{m}\frac{ds}{dt} + \omega_0^2 s = -\frac{e}{m}E. \tag{A.57}$$


Plane polarized waves enter the medium, initially with their E vector parallel 
to the x axis. These waves impinge upon the quasi-elastically bound electrons, 
forcing the motion 


s = Re [soe OP) 5 (A.58) 
E = Re | Eod], (A.59) 
where $j^2 = -1$, but $ij \neq -1$. We use $j$ as a vector, orthogonal to $i$ and to the real
line. Re is the real part with respect to the $j$ complex part of a complex number.
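Substituting for $E$ and $s$, we obtain

$$\left(-\omega^2 + ij\frac{eB}{m}\omega + \omega_0^2\right)s_0 = -\frac{e}{m}E_0\,e^{j\phi}. \tag{A.60}$$

The amplitudes $s_0$ and $E_0$ are purely real both in $i$ and $j$. Comparing real and
imaginary parts in $j$, we obtain two equations:

$$(\omega_0^2 - \omega^2)\,s_0 = -\frac{e}{m}E_0\cos\phi, \tag{A.61}$$

$$i\,\frac{eB\omega}{m}\,s_0 = -\frac{e}{m}E_0\sin\phi. \tag{A.62}$$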


The phase $\phi$ is $i$ complex, and this leads to rotation of the polarization plane,
but this is not the best way to proceed. We shall show below that it is enough
that the wavevector $k$ be an $i$ complex number to have rotation of the polarization
plane vector $E$. To find $k$, we must find the dispersion relation for waves in a
magnetized dielectric. It is assumed that the resonant frequency of the system
is greater than the frequency of the electromagnetic waves ($\omega_0 > \omega$). For
low-energy radiation, this is reasonable. Defining the usual relations


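$$\begin{aligned}
\mathbf{P} &= -\rho_N e\,\mathbf{s} \\
\mathbf{D} &= \mathbf{P} + \epsilon_0\mathbf{E} \\
\mathbf{B} &= \mu_0\mathbf{H} \\
c^2 &= \frac{1}{\epsilon_0\mu_0},
\end{aligned} \tag{A.63}$$
from Maxwell’s equations, one has that
$$\nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu_0\frac{\partial^2\mathbf{P}}{\partial t^2}. \tag{A.64}$$
Differentiating twice with respect to $t$ allows one to substitute for $s$ in terms of
$P$ and therefore $E$, so that one may eliminate $s$ altogether to obtain a dispersion relation
for the waves:

$$\left[\frac{\partial^2}{\partial t^2} - i\frac{eB}{m}\frac{\partial}{\partial t} + \omega_0^2\right]\left(\nabla^2 E - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2}\right) = \frac{\mu_0\rho_N e^2}{m}\frac{\partial^2 E}{\partial t^2}. \tag{A.65}$$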


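For linearly polarized plane waves, it then follows that

$$\left(-\omega^2 + ij\frac{eB}{m}\omega + \omega_0^2\right)\left(-k^2 + \frac{\omega^2}{c^2}\right) = -\frac{\mu_0\rho_N e^2}{m}\,\omega^2. \tag{A.66}$$

This is the dispersion relation. The $ij$ complex nature is a direct result of the
coupling to the magnetic field. Re-arranging:

$$k^2 = \frac{\omega^2}{c^2}\left[1 + \frac{\mu_0\rho_N e^2c^2/m}{\left(-\omega^2 + ij\frac{eB}{m}\omega + \omega_0^2\right)}\right] \qquad (\omega_0 > \omega). \tag{A.67}$$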


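The wavevector is therefore a complex number. Writing the wavenumber with
real and imaginary parts separated:

$$k = k_r - ij\,k_i \qquad (k_i > 0), \tag{A.68}$$
one can substitute back into the plane wave:
$$\begin{aligned}
E &= E_0\cos(kz - \omega t) = \mathrm{Re}\,E_0\,e^{j(kz - \omega t)} \\
&= \mathrm{Re}\,E_0\exp j(k_r z - \omega t - ij\,k_i z) \\
&= E_0\exp(i k_i z)\,\mathrm{Re}\exp\left(j(k_r z - \omega t)\right) \\
E &= E_0\cdot\underbrace{\left[\cos(k_i z) + i\sin(k_i z)\right]}_{\text{rotation}\,\propto\,z}\cdot\underbrace{\cos(k_r z - \omega t)}_{\text{travelling wave}}.
\end{aligned} \tag{A.69}$$

Thus, we have a clockwise rotation of the polarization plane.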
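A.7 Vector identities in n = 3 dimensions


For general vectors $\mathbf{A}$ and $\mathbf{B}$, and scalar $\phi$,

$$\nabla\cdot(\phi\mathbf{A}) = \phi(\nabla\cdot\mathbf{A}) + \mathbf{A}\cdot(\nabla\phi) \tag{A.70}$$
$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot(\nabla\times\mathbf{A}) - \mathbf{A}\cdot(\nabla\times\mathbf{B}) \tag{A.71}$$
$$\nabla\times(\phi\mathbf{A}) = \phi(\nabla\times\mathbf{A}) + (\nabla\phi)\times\mathbf{A} \tag{A.72}$$
$$\nabla\times\nabla\phi = 0 \tag{A.73}$$
$$\nabla\cdot(\nabla\times\mathbf{A}) = 0 \tag{A.74}$$
$$\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}. \tag{A.75}$$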


A.8 The Stokes and Gauss theorems 


Stokes’ theorem in three spatial dimensions states that

$$\int_R (\nabla\times\mathbf{A})\cdot d\mathbf{S} = \oint_C \mathbf{A}\cdot d\mathbf{l}, \tag{A.76}$$

i.e. the integral over a surface region $R$ of the curl of a vector, also called the
flux of the curl of that vector, is equal to the value of the vector integrated along
a loop $C$ which bounds the region.

The Gauss divergence theorem in three-dimensional vector language states
that

$$\int_\sigma (\nabla\cdot\mathbf{A})\,d\sigma = \oint_S \mathbf{A}\cdot d\mathbf{S}; \tag{A.77}$$

i.e. the integral over a spatial volume, $\sigma$, of the divergence of a vector is equal to
the integral over the surface enclosing the volume of the vector itself. In index
notation this takes on the trivial form:

$$\int d\sigma\,\partial^i A_i = \oint dS^i A_i, \tag{A.78}$$


and the spacetime generalization to n + 1 dimensions (which we use frequently)
is

$$\int dV_x\,\partial^\mu A_\mu = \oint d\sigma^\mu A_\mu. \tag{A.79}$$
Notice that Gauss’ law is really just the generalization of integration by parts in a
multi-dimensional context. In action expressions we frequently use the quantity
$(dx) = dV_t = \frac{1}{c}dV_x$, whence

$$\int (dx)\,\partial^\mu A_\mu = \frac{1}{c}\oint d\sigma^\mu A_\mu. \tag{A.80}$$
A.9 Integrating factors


Differential equations of the form

$$\frac{dy}{dx} + f(x)\,y = g(x) \tag{A.81}$$

can often be solved by multiplying through by a factor $I(x)$

$$I(x)\frac{dy}{dx} + f(x)\,I(x)\,y = g(x)\,I(x), \tag{A.82}$$

which makes the left hand side a perfect differential:

$$\frac{d(uv)}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}. \tag{A.83}$$

Comparing these equations and identifying $u = I$ and $v = y$, one finds

$$\frac{dI}{dx} = f(x)\,I(x), \tag{A.84}$$
which solves to give
$$I(x) = \exp\left(\int_0^x f(x')\,dx'\right). \tag{A.85}$$

Thus the differential equation (A.81) may be written
$$\frac{d}{dx}\left(I(x)\,y\right) = g(x)\,I(x). \tag{A.86}$$
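As a minimal sketch of how (A.85) and (A.86) are used in practice, the following Python code integrates both sides of (A.86) numerically from $x = 0$. The example equation $y' + y = x$ with $y(0) = 1$, the trapezoidal rule and the function name are illustrative assumptions; for this example the exact solution is $y = x - 1 + 2e^{-x}$.

```python
import math

def solve_with_integrating_factor(f, g, y0, x_end, steps=2000):
    """Return y(x_end) for y' + f(x) y = g(x), y(0) = y0.

    Integrating (A.86) from 0 to x gives I(x) y(x) - y0 = int_0^x g I dx',
    with I(x) = exp(int_0^x f dx') and I(0) = 1. Both integrals are
    accumulated with the trapezoidal rule.
    """
    h = x_end / steps
    x = 0.0
    F = 0.0        # running integral of f
    G = 0.0        # running integral of g * I
    I_prev = 1.0   # integrating factor at the current point
    for _ in range(steps):
        x_next = x + h
        F_next = F + 0.5 * h * (f(x) + f(x_next))
        I_next = math.exp(F_next)
        G += 0.5 * h * (g(x) * I_prev + g(x_next) * I_next)
        x, F, I_prev = x_next, F_next, I_next
    return (y0 + G) / I_prev

# Example: y' + y = x with y(0) = 1.
y_numeric = solve_with_integrating_factor(lambda x: 1.0, lambda x: x, 1.0, 1.0)
y_exact = 1.0 - 1.0 + 2.0 * math.exp(-1.0)
```

The same function applies to any $f$ and $g$ that can be evaluated pointwise, although the accuracy then depends on the step count.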
A.10 Matrix formulae 


The so-called Baker–Campbell–Hausdorff identity for non-singular matrices $A$
and $B$ states that

$$e^{-A}Be^{A} = B + \frac{1}{1!}[B, A] + \frac{1}{2!}[[B, A], A] + \cdots. \tag{A.87}$$
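The series can be checked numerically for small matrices. The sketch below assumes the identity in the form $e^{-A}Be^{A} = B + [B, A] + \frac{1}{2!}[[B, A], A] + \cdots$ and evaluates both sides with truncated power series; the matrices, the truncation order and the helper names are arbitrary choices for illustration.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y, scale=1.0):
    """Return X + scale * Y."""
    n = len(X)
    return [[X[i][j] + scale * Y[i][j] for j in range(n)] for i in range(n)]

def identity(n):
    """n x n unit matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(X, terms=40):
    """Matrix exponential from its power series (adequate for small norms)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, X)]
        result = add(result, term)
    return result

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    return add(matmul(X, Y), matmul(Y, X), -1.0)

A = [[0.0, 0.3], [-0.2, 0.1]]
B = [[1.0, 2.0], [3.0, 4.0]]

# Left-hand side: exp(-A) B exp(A).
neg_A = [[-a for a in row] for row in A]
lhs = matmul(matmul(expm(neg_A), B), expm(A))

# Right-hand side: B + [B, A]/1! + [[B, A], A]/2! + ...
rhs = [row[:] for row in B]
term = [row[:] for row in B]
factorial = 1.0
for k in range(1, 40):
    term = commutator(term, A)
    factorial *= k
    rhs = add(rhs, term, 1.0 / factorial)

max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Swapping the order to $e^{A}Be^{-A}$ gives the same series with $[A, B]$ in place of $[B, A]$ at each level of nesting.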


A.11 Matrix factorization 


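A formula which is useful in diagonalizing systems is:
$$\begin{pmatrix} A_1 & A \\ B & A_2 \end{pmatrix} = \begin{pmatrix} A_1 - AA_2^{-1}B & AA_2^{-1} \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ B & A_2 \end{pmatrix}, \tag{A.88}$$

$$\det\begin{pmatrix} A_1 & A \\ B & A_2 \end{pmatrix} = \det\left(A_1 - AA_2^{-1}B\right)\det A_2. \tag{A.89}$$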
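The determinant formula can be tested on a small example. The following Python sketch uses $2 \times 2$ blocks with arbitrary numerical entries (an assumption for illustration, chosen so that $A_2$ is invertible) and compares the determinant of the full $4 \times 4$ matrix with the factorized form; the helper names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def sub(X, Y):
    """Element-wise difference X - Y."""
    return [[X[i][j] - Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for c in range(len(M)):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * M[0][c] * det(minor)
    return total

def inverse_2x2(M):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    d = det(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

# Example blocks; A2 must be invertible.
A1 = [[2.0, 1.0], [0.5, 3.0]]
A = [[1.0, -1.0], [2.0, 0.5]]
B = [[0.0, 1.5], [-2.0, 1.0]]
A2 = [[4.0, 1.0], [1.0, 2.0]]

# Full 4x4 matrix with blocks (A1, A) over (B, A2).
full = [A1[0] + A[0], A1[1] + A[1], B[0] + A2[0], B[1] + A2[1]]

# Upper-left block of the first factor in (A.88): A1 - A A2^{-1} B.
reduced = sub(A1, matmul(matmul(A, inverse_2x2(A2)), B))

lhs = det(full)
rhs = det(reduced) * det(A2)
```

The two values agree to rounding error, as expected from taking the determinant of both sides of (A.88).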
Recommended reading


J.M. Cassels, Basic Quantum Mechanics (2nd edition). Macmillan Press, 
London (1970). An excellent summary of basic quantum mechanics. 


B. DeWitt, Dynamical Theory of Groups and Fields. Gordon and Breach, 
New York (1965). This demanding book contains deep insights into basic 
field theory, prior to the understanding of non-Abelian gauge theories. 
There is no other book like it. Metric conventions are the same as in this 
book. 


K. Huang, Statistical Mechanics. John Wiley and Sons, New York (1963). 
A classic book on statistical mechanics, which details the foundations of 
the subject, in a scholarly fashion, prior to the renormalization group era. 


H.F. Jones, Groups, Representations and Physics (2nd edition). Institute 
of Physics IoP Press, Bristol (1998). A very nice introduction to group 
theory for physicists, with much more attention to relevant detail than 
most group theory texts. A very nice summary of Dirac notation. 


S. Schweber, Relativistic Quantum Field Theory, Harper & Row, New 
York (1961). Although a little dated, this is still one of the most scholarly 
books on quantum field theory. It is one of the few books which answers 
more probing questions than it raises about the formulation of field theory. 
This book cannot be praised highly enough. The opposite metric signature 
is used. 


J. Schwinger, Particles, Sources and Fields, Volume I. Addison Wesley, 
Redwood, CA (1970). This book is Schwinger’s motivation for, and 
treatise on, source theory, which is a formulation of effective quantum 
field theory. This is a classic work, which is full of important insights for 
the dedicated reader. The conventions are largely the same as those used 
here. 
J. Schwinger, L.L. DeRaad, K.A. Milton and W. Tsai, Classical Electrodynamics,
Perseus, Reading MA (1998). A long awaited book on the
Green function approach to classical electrodynamics. Alas, it uses old
Gaussian units, which can be confusing with regard to dimensions and
factors of c. Notations otherwise resemble those used here.


B. Schutz, Geometrical Methods in Mathematical Physics. Cambridge
University Press (1980). A uniquely readable, and unpretentious, introduction
to geometrical methods with carefully crafted examples.


S. Weinberg, Gravitation and Cosmology. J. Wiley and Sons, New York 
(1972). An excellent introduction to the general theory of relativity and 
its influence on physics. The conventions used are the same as those used 
in this book. 


S. Weinberg, Quantum Theory of Fields, Volume I, Cambridge University 
Press (1995). A new book, which takes over where Schweber leaves 
off and one of the few books on quantum field theory which tries to 
explain what field theory is really about. A must for any field theorist. 
Conventions are similar to this book, but the Lagrangian functions differ 
by an overall sign. 
state independent Green function, 126
stationary waves, 74 
statistical expectation values, 373 
statistical mechanics, 372 
step function, 503 
Stokes’ theorem, 511 
structure constants, 183, 468 
sub-groups, 170 
SU (2), 476 
substantive derivative, 156, 246 
summation convention, 404 
susceptibility, 352 
Green function, 82 
Magnetic, 460 
thermal, 124 
symmetry breaking 
by boundary condition, 253 
dynamical, 281 
global, 274 
local, 278 
spontaneous, 130, 336 
symmetry, Hamiltonian view, 360 
symplectic coordinates, 362 
symplectic transformations, 360 
synchrotron radiation, 148 


tangent space, 36 

TCP theorem, 263 
thermal conductivity, 317 
thermal susceptibility, 124 
time, special role of, 358 


time-ordered products, 114 
time-reversal transformation, 209 
time-translation generator, 365 
trace of energy-momentum tensor, 300 
transformation 

of coordinates, 207 

of group vectors, 171 
transformation function, 380 
translation in periodic lattice, 213 
transversality, 236 
transverse components, 43 
travelling waves, 74 
triality, 205 


unitarity, 88 

and macrostate, 126 
unitary gauge, 280 
unitary matrices, 199 
units, defined, 399 
universal cover group, 203 


vacuum, 4 

variation 
classical dynamical variables, 365 
dynamical, 68 
gauge-invariant, 70 
non-dynamical, 68 

variation of an operator, 380 
vector
length of, 38 
potential, 12 
product, 40 
Verdet’s constant, 142, 510 
vielbein, 300 
virtual processes, 121 
viscosity, 317 
vortices, 164 


wavefunction, 376 
gauge transformation, 264 
wavenumber k,,, 42 
waves, electromagnetic, 18 
Weyl spinors, 441 
Wick rotation, 47, 48, 113 
Wightman functions, 83, 88 
Green functions, 91 
n = 3,101 
Wilson loop, 213 
world-lines, 57 


Yang-Mills theory, 467 
Zeeman effect, 139
Zitterbewegung, 351 


